
An online textbook by Rob J Hyndman and George Athanasopoulos

Welcome to our online textbook on forecasting. This textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly. We don’t attempt to give a thorough discussion of the theoretical details behind each method, although the references at the end of each chapter will fill in many of those details. The book is written for three audiences: (1) people finding themselves doing forecasting in business when they may not have had any formal training in the area; (2) undergraduate students studying business; (3) MBA students doing a forecasting elective. We use it ourselves for a second-year subject for students undertaking a Bachelor of Commerce degree at Monash University, Australia. For most sections, we only assume that readers are familiar with algebra, and high school mathematics should be sufficient background. Use the table of contents on the right to browse the book.

Related: Time Series, Forecasting

What is volatility? Some facts and some speculation. Definition: Volatility is the annualized standard deviation of returns; it is often expressed in percent. A volatility of 20 means that there is about a one-third probability that an asset’s price a year from now will have fallen or risen by more than 20% from its present value. In R the computation, given a series of daily prices, looks like: sqrt(252) * sd(diff(log(priceSeriesDaily))) * 100
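The R one-liner in the volatility item above can be mirrored in Python. This is a minimal sketch that uses only the standard library; the function name `annualized_volatility`, the `trading_days` parameter, and the sample prices are assumptions made for illustration and do not come from the original post.

```python
import math
import statistics


def annualized_volatility(daily_prices, trading_days=252):
    """Annualized standard deviation of daily log returns, in percent."""
    # Daily log returns, the analogue of diff(log(prices)) in R.
    log_returns = [
        math.log(today) - math.log(yesterday)
        for yesterday, today in zip(daily_prices, daily_prices[1:])
    ]
    # statistics.stdev uses the n - 1 denominator, like R's sd().
    return math.sqrt(trading_days) * statistics.stdev(log_returns) * 100


# Hypothetical prices, invented purely for illustration.
prices = [100.0, 101.0, 99.5, 100.5, 102.0, 101.0]
vol = annualized_volatility(prices)
print(round(vol, 1))
```

The factor sqrt(252) assumes roughly 252 trading days per year, as in the R expression; a different calendar would need a different `trading_days` value.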

Forecasting within limits. It is common to want forecasts to be positive, or to require them to be within some specified range. Both of these situations are relatively easy to handle using transformations. Positive forecasts: To impose a positivity constraint, simply work on the log scale.

Ebook Archive: Internet Archive. Additional collections of scanned books, articles, and other texts (usually organized by topic) are presented here. United States Patent and Trademark Office documents contributed by Think Computer Foundation. Topic: U.S. Patent. The American Libraries collection includes material contributed from across the United States. Institutions range from the Library of Congress to many local public libraries. As a whole, this collection of material brings holdings that cover many facets of American life and scholarship into the public domain.
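Returning to the "Forecasting within limits" item: the idea of forecasting on a transformed scale can be sketched as a pair of transform and back-transform functions. The log pair follows the positivity advice in the excerpt. The scaled logit pair is one common choice for the "specified range" case and is an assumption here, since the excerpt does not say which transformation it uses. All function names are hypothetical.

```python
import math


def to_log_scale(x):
    """Map a positive value to the unconstrained log scale."""
    return math.log(x)


def from_log_scale(z):
    """Back-transform; exp() is positive, so forecasts cannot go negative."""
    return math.exp(z)


def to_logit_scale(x, lower, upper):
    """Scaled logit: map a value in (lower, upper) to the whole real line."""
    return math.log((x - lower) / (upper - x))


def from_logit_scale(z, lower, upper):
    """Back-transform; the result lies between lower and upper."""
    return lower + (upper - lower) / (1 + math.exp(-z))


# A forecast made on the transformed scale can take any real value,
# yet it maps back inside the required limits.
print(from_log_scale(-5.0))
print(from_logit_scale(10.0, 50.0, 100.0))
```

In use, a model would be fitted to the transformed series and its forecasts passed through the matching back-transform.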

CSV To SQL Converter. Use this tool to convert CSV to SQL statements.

Time Series Analysis. Any metric that is measured over time is a time series. Time series matter in industry, especially for forecasting (demand, sales, supply, etc.). A series can be broken down into its components so that it can be forecast systematically.

Interpreting noise. When watching the TV news, or reading newspaper commentary, I am frequently amazed at the attempts people make to interpret random noise. For example, the latest tiny fluctuation in the share price of a major company is attributed to the CEO being ill. When the exchange rate goes up, the TV finance commentator confidently announces that it is a reaction to Chinese building contracts. No one ever says “The unemployment rate has dropped by 0.1% for no apparent reason.” What is going on here is that the commentators are assuming we live in a noise-free world. They imagine that everything is explicable; you just have to find the explanation.
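Both the component view of a time series and the warning about noise can be illustrated with a small simulation. The sketch below builds a series from a known trend, a seasonal pattern and random noise, then estimates the trend with a centered moving average. This is one classical technique, not necessarily the one the linked pages use, and every number is invented for the example.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

PERIOD = 12
N = 60

# Simulated monthly series built from known components (all values invented).
trend = [10 + 0.5 * t for t in range(N)]
seasonal = [3 * math.sin(2 * math.pi * t / PERIOD) for t in range(N)]
noise = [random.gauss(0, 1) for _ in range(N)]
series = [trend[t] + seasonal[t] + noise[t] for t in range(N)]


def centered_moving_average(y, period):
    """2 x period centered moving average (even period); None where undefined."""
    half = period // 2
    out = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half : t + half + 1]
        # End points get half weight so each season is counted exactly once.
        total = sum(window) - 0.5 * (window[0] + window[-1])
        out[t] = total / period
    return out


trend_estimate = centered_moving_average(series, PERIOD)

# What is left after removing the estimated trend is seasonality plus noise.
remainder = [
    None if m is None else y - m for y, m in zip(series, trend_estimate)
]
```

The averaging cancels the seasonal pattern and damps the noise, so `trend_estimate` tracks the true trend closely. The month-to-month wiggles that remain in `remainder` after the seasonal pattern is accounted for have no explanation beyond chance.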

Debunking Handbook, The. Posted on 27 November 2011 by John Cook. The Debunking Handbook, a guide to debunking misinformation, is now freely available to download. Although there is a great deal of psychological research on misinformation, there's no summary of the literature that offers practical guidelines on the most effective ways of reducing the influence of myths. The Debunking Handbook boils the research down into a short, simple summary, intended as a guide for communicators in all areas (not just climate) who encounter misinformation.

D3.js Resources to Level Up. I have gotten a lot better at D3.js development over the past few years, and can trace most of my improvement to coming across a few key tutorials, blogs, books and other resources on the topic. They’ve been a huge help for me, and I’ve gathered a bunch of my favorites in this post to hopefully help others improve their D3 experience. Here it goes:

R Video tutorial for Spatial Statistics: Introductory Time-Series analysis of US Environmental Protection Agency (EPA) pollution data. Download EPA air pollution data: The US Environmental Protection Agency (EPA) provides a large amount of free data about air pollution and other weather measurements through its website. An overview of what is offered is available here: The data are provided as hourly, daily and annual averages for the following parameters: Ozone, SO2, CO, NO2, PM2.5 FRM/FEM Mass, PM2.5 non FRM/FEM Mass, PM10, Wind, Temperature, Barometric Pressure, RH and Dewpoint, HAPs (Hazardous Air Pollutants), VOCs (Volatile Organic Compounds) and Lead. All the files are accessible from this page: The web links to download the zip files are very similar to each other. They have an initial starting URL: and then the name of the file has the following format: The type can be: hourly, daily or annual. The properties are sometimes written as text and sometimes using a numeric ID.

Errors on percentage errors. The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as $\text{MAPE} = 100\,\text{mean}(|y_t - \hat{y}_t|/|y_t|)$, where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t$. Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE, saying that “it has a bias favoring estimates that are below the actual values”. A few years later, Armstrong and Collopy (1992) argued that the MAPE “puts a heavier penalty on forecasts that exceed the actual than those that are less than the actual”.

Historic documents in computer science. Fortran: Fortran Automated Coding System for the IBM 704, the very first Fortran manual, by John Backus et al., Oct. 1956. Al Kossow also has an IBM 704 manual in his manual collection, if you want to have a look at the machine that the original Fortran language was made for. The next IBM manuals are also from the collection at his web site, where you can find a number of Fortran manuals for many machines from various manufacturers. The FORTRAN II General Information Manual and IBM 7090/7094 Programming Systems: FORTRAN II Programming are two IBM manuals of 1963 describing the FORTRAN II language. IBM 7090/7094 Programming Systems: FORTRAN IV Language, 1963, and IBM System 360 and System 370 FORTRAN IV Language, 1974. Alongside the ANSI standard of 1966, FORTRAN IV was for a long time the reference language for Fortran as used by legions of scientists and engineers. Algol 60: I think this is the original Peter Naur edition of the Algol 60 report.
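The asymmetry described in the "Errors on percentage errors" item can be checked numerically. The sketch below implements the MAPE as the mean of |y - f| / |y| times 100 and shows one reading of the quoted claims, under the assumption that both actuals and forecasts are non-negative: an under-forecast can never score worse than 100%, while an over-forecast has no ceiling. The function name and the numbers are invented for illustration.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent; actuals must be non-zero."""
    errors = [abs(y - f) / abs(y) for y, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)


actual = [100.0]

# With non-negative forecasts, the worst possible under-forecast is 0,
# so the percentage error of an under-forecast cannot exceed 100%.
print(mape(actual, [0.0]))    # 100.0

# An over-forecast has no such ceiling.
print(mape(actual, [300.0]))  # 200.0
```

A method chosen by minimising the MAPE on such data therefore has more to lose from forecasting too high than from forecasting too low.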

21 tools that will help your remote team work better together - Page 20 of 20. Meldium: Securely sharing passwords with people in your team across the Internet is no easy feat. Getting your team on Meldium means you have control over who has access to what and passwords are never exposed to team members. Instead, Meldium offers single sign-on so they can log in with one click without ever seeing the actual password. Meldium works with Internet Explorer, Firefox, Chrome, iOS and Android.

Time series outlier detection (a simple R function) (By Andrea Venturini). Imagine you have a lot of time series, possibly short ones, related to a lot of different measures, and very little time to find outliers. You need something not too sophisticated to sort out the mess quickly. This is, in short, the typical situation in which you can use the washer.AV() function in R. In this linked document (washer) you have the function and an example of an actual application in R: a data.frame (dati) with temperature and rain (phen) measures (value) in 4 periods of time (time) and in 20 geographical zones (zone). (20*4*2=160 arbitrary observations). > dati