How To Identify Patterns in Time Series Data: Time Series Analysis

In the following topics, we will first review techniques used to identify patterns in time series data (such as smoothing and curve fitting techniques and autocorrelations), then we will introduce a general class of models that can be used to represent time series data and generate predictions (autoregressive and moving average models). Finally, we will review some simple but commonly used modeling and forecasting techniques based on linear regression.

General Introduction

In the following topics, we will review techniques that are useful for analyzing time series data, that is, sequences of measurements that follow non-random orders. Detailed discussions of the methods described in this section can be found in Anderson (1976), Box and Jenkins (1976), Kendall (1984), Kendall and Ord (1990), Montgomery, Johnson, and Gardiner (1990), Pankratz (1983), Shumway (1988), Vandaele (1983), Walker (1991), and Wei (1989).
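As a minimal sketch of two of the pattern-identification tools named above (smoothing and autocorrelations), the following uses only the Python standard library. The series is hypothetical (a linear trend plus a 12-step cycle), and the function names `moving_average` and `autocorrelation` are illustrative, not taken from the referenced texts.

```python
import math

def moving_average(series, window):
    """Simple moving average: the mean of each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def autocorrelation(series, lag):
    """Sample autocorrelation at a non-negative lag, using the overall mean."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom

# Hypothetical data: linear trend plus a cycle that repeats every 12 steps.
series = [0.1 * t + math.sin(2 * math.pi * t / 12) for t in range(120)]

# A window equal to the cycle length averages the cycle out, leaving the trend.
smoothed = moving_average(series, 12)

# Autocorrelations at lags 0..24; lag 0 is 1 by construction.
acf = [autocorrelation(series, k) for k in range(25)]
```

Choosing the smoothing window equal to the cycle length is a deliberate choice here: each window then covers exactly one full cycle, so the smoothed values rise by the trend slope alone.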

Assumptions of Statistical Tests

“All models are incorrect. Some are useful.” George Box

When you do a statistical test, you are, in essence, testing whether the assumptions are valid. We are typically only interested in one, the null hypothesis: the assumption that the difference is zero (although the test could equally ask whether the difference is some other amount). Let me focus on the lowly t-test and also a simple two-way ANOVA (comparing two groups and time [repeated measurements]). A second assumption is that the data are normally distributed. So, how bad is the effect of non-normal data? The dotted line represents a true normal distribution, which is what we hope to eventually see. This is also the reason why a parameter that results from a number of small effects tends to be normally distributed. One cause of non-normality is outliers, or extreme values. If the data are skewed, one reason might be that the measuring instrument was not constructed to differentiate on one or both ends of the scale.

Time series

Time series: random data plus trend, with best-fit line and different applied filters

A time series is a sequence of data points, typically measured at successive points in time spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones Industrial Average and the annual flow volume of the Nile River at Aswan. Time series are very frequently plotted via line charts. Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, and communications engineering. Time series data have a natural temporal ordering. Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language).
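The "random data plus trend, with best-fit line" idea from this entry can be sketched in a few lines. This is an illustration under stated assumptions: the data are synthetic (slope 0.05, Gaussian noise, fixed seed), the time index is taken as 0, 1, 2, ..., and `fit_line` is a hypothetical helper implementing ordinary least squares.

```python
import random

def fit_line(y):
    """Ordinary least-squares line y ~ intercept + slope * t for t = 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

# Synthetic series: random noise plus a linear trend (assumed values).
rng = random.Random(42)
series = [0.05 * t + rng.gauss(0, 1) for t in range(100)]

slope, intercept = fit_line(series)

# Subtracting the fitted line leaves the detrended (residual) series.
detrended = [v - (intercept + slope * t) for t, v in enumerate(series)]
```

The detrended values are what a filter or autocorrelation analysis would normally be applied to once the trend has been accounted for.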

Online Statistics Education: A Free Resource for Introductory Statistics Developed by Rice University (Lead Developer), University of Houston Clear Lake, and Tufts University OnlineStatBook Project Home This work is in the public domain. If you are an instructor using these materials, I can send you an instructor's manual, PowerPoint Slides, and additional questions that may be helpful to you. Table of Contents Mobile This version uses formatting that works better for mobile devices. Rice Virtual Lab in Statistics This is the original classic with all the simulations and case studies. Version in PDF e-Pub (e-book) Partial support for this work was provided by the National Science Foundation's Division of Undergraduate Education through grants DUE-9751307, DUE-0089435, and DUE-0919818.

Introduction to the Scientific Method

The scientific method is the process by which scientists, collectively and over time, endeavor to construct an accurate (that is, reliable, consistent and non-arbitrary) representation of the world. Recognizing that personal and cultural beliefs influence both our perceptions and our interpretations of natural phenomena, we aim through the use of standard procedures and criteria to minimize those influences when developing a theory. As a famous scientist once said, "Smart people (like smart lawyers) can come up with very good explanations for mistaken points of view." In summary, the scientific method attempts to minimize the influence of bias or prejudice in the experimenter when testing an hypothesis or a theory. If the experiments bear out the hypothesis it may come to be regarded as a theory or law of nature (more on the concepts of hypothesis, model, theory and law below). Errors in experiments have several sources.

Introduction to ANOVA

An ANOVA has factors (variables), and each of those factors has levels. There are several different types of ANOVA, and there are four main assumptions of an ANOVA. Hypotheses in ANOVA depend on the number of factors you're dealing with. Effects dealing with one factor are called main effects. Here's an example of an interaction effect in an ANOVA: below we have a factorial ANOVA with two factors, dosage (0mg and 100mg) and gender (men and women). Dosage and gender are interacting because the effect of one variable depends on which level you're at of the other variable. If we reject the null hypothesis in an ANOVA, all we know is that there is a difference somewhere among the groups. When performing an ANOVA, we calculate an "F" statistic. If there are no treatment differences (that is, if there is no actual effect), we expect F to be close to 1.
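To make the F statistic concrete, here is a small sketch of the one-way case: F is the between-group mean square divided by the within-group mean square. The three groups are made-up numbers, and `one_way_f` is an illustrative helper rather than anything defined in the lecture.

```python
def one_way_f(groups):
    """F statistic for a one-way ANOVA: MS_between / MS_within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat = one_way_f(groups)
print(f_stat)  # 3.0
```

For these numbers the between-group sum of squares is 6 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so F = 3 / 1 = 3. When the group means differ only by chance, the two mean squares estimate the same variance and F tends to be near 1.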

Information by Eurostat and National Bank of Belgium

Download Demetra+

In case of any problems or questions, please contact Eurostat Unit B2 Methodology and Research. Documentation about Demetra+ can be downloaded from the CROS portal. A reference page to the ESS guidelines on seasonal adjustment and information about the next ESTP course on DEMETRA+ for beginners and advanced users, to be held at Eurostat, are also linked.

Description

Seasonal adjustment is an important step of the official statistics business architecture, and harmonisation of practices has proved to be a key element of the quality of the output. In 2008, the ESS (European Statistical System) guidelines on SA were endorsed by the CMFB and the SPC as a framework for seasonal adjustment of PEEIs and other ESS and ESCB economic indicators. Demetra+ is a family of modules on seasonal adjustment, which are based on the two leading algorithms in that domain (TRAMO&SEATS / X-12-ARIMA).

“The Daily Rind”, a Better Way to Plan the Day — A. King in Society Photo: A sample “daily rind” from my notebook For years my task and schedule management lived across various apps — OmniFocus, Basecamp, Google Calendar, and others (and more recently, as I pared down my “productivity” tools, a simple combination of The Hit List + iCal.) But mapping out what to do throughout my day in a reliable way has always been a problem. Really understanding how little time there was and seeing patterns in time usage proved next to impossible, despite all the technology at my fingertips. I think I’ve found a better way. I still track my projects and tasks digitally, and keep a calendar (with online sync + backup) for planning ahead, but for mapping out what I’m going to do in the day ahead of me, I’ve devised a decidedly low-tech system which I’m lovingly referring to as “The Daily Rind.” Re-introducing Analog -or- Can’t Get No Satisfaction? I’d never really been attracted to using a paper-based day-planner. Hacking the Muji Chronotebook 1. 2. 3. The First Rind — Day

The Statistics Homepage "Thank you and thank you again for providing a complete, well-structured, and easy-to-understand online resource. Every other website or snobbish research paper has not deigned to explain things in words consisting of less than four syllables. I was tossed to and fro like a man holding on to a frail plank that he calls his determination until I came across your electronic textbook...You have cleared the air for me. You have enlightened. You have illuminated. You have educated me." — Mr. "As a professional medical statistician of some 40 years standing, I can unreservedly recommend this textbook as a resource for self-education, teaching and on-the-fly illustration of specific statistical methodology in one-to-one statistical consulting. — Mr. "Excellent book. — Dr. "Just wanted to congratulate whoever wrote the 'Experimental Design' page. — James A. Read More Testimonials >> StatSoft has freely provided the Electronic Statistics Textbook as a public service since 1995. Proper citation:

Problem of alpha inflation

The main problem that designers of post hoc tests try to deal with is α-inflation. This refers to the fact that the more tests you conduct at α = .05, the more likely you are to claim you have a significant result when you shouldn't have (i.e., a Type I error). That is, there is a 40.1% chance of making a Type I error somewhere among your six t-tests (instead of 5%)! The overall chance of a Type I error in a particular experiment is referred to as the experimentwise error rate (sometimes called the familywise error rate).

How do I fit an ARIMA model to a time series with XLSTAT-Time? - Tutorials - XLSTAT

Dataset to fit an ARIMA model to a time series

An Excel sheet with both the data and results can be downloaded by clicking here. The data have been obtained in [Box, G.E.P. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco], and correspond to monthly international airline passengers (in thousands) from January 1949 to December 1960. We notice on the chart that there is a global upward trend, that every year a similar cycle starts, and that the variability within a year seems to increase over time. We can now fit an ARIMA(0,1,1)(0,1,1)12 model, which seems to be appropriate to remove the trend effect and the yearly seasonality of the data.

Setting up the fitting of an ARIMA model to a time series

After opening XLSTAT, select the XLSTAT / XLSTAT-Time / ARIMA command, or click on the corresponding button of the "XLSTAT-Time" toolbar. Once you've clicked on the button, the ARIMA dialog box will appear. The ARIMA model writes:
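As a rough sketch of what the two differencing orders in ARIMA(0,1,1)(0,1,1)12 do, the following applies one ordinary difference (lag 1) and one seasonal difference (lag 12) to a made-up monthly series. This is not XLSTAT's implementation and does not use the airline data; the `difference` helper and the seasonal pattern are assumptions for illustration, and the moving average terms of the model are not covered.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly data: a linear trend plus a pattern repeating every 12 months.
seasonal = [0, 3, 5, 4, 6, 9, 12, 11, 7, 4, 1, 2]
series = [2 * t + seasonal[t % 12] for t in range(48)]

# d = 1: the ordinary difference turns the linear trend into a constant.
first_diff = difference(series, lag=1)

# D = 1 with period 12: the seasonal difference removes the repeating yearly pattern.
both_diff = difference(first_diff, lag=12)
```

For this idealised series the twice-differenced values are all zero, because the trend is exactly linear and the yearly pattern repeats exactly. With real data, what remains after differencing is the part that the moving average terms are fitted to.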

Preparing for a Post Peak Life | Post Peak Living Version 3 released February 15, 2011 with new discussion on the global debt problem. No time to watch now? Watch the first ten minutes, which are a complete synopsis, then come back for the rest. Click on the lower right to expand the video to full screen. The previous version of this presentation is available in English here and in Slovak here. After Watching the Video If you're interested in learning more or are ready to begin preparing, you can: Enroll in the Start Now Mini Course (look at the sidebar to the right). Notes Right-click here and choose "Save Target As..." Slide: What the Next Decade Will Bring Does peak oil mean we don't have to worry about climate change? Two meter sea level rise unstoppable (Reuters) Heat waves and extremely high temperatures could be commonplace in the U.S. by 2039, Stanford study finds ( Slide: Peak Oil: Supply Falls Short of Demand Slide: U.S. The United States currently uses 25% of the world oil production but has only 2% of world reserves.

Margin of Error and Confidence Levels Made Simple

Pamela Hunter, February 26, 2010

A survey is a valuable assessment tool in which a sample is selected and information from the sample can then be generalized to a larger population. Surveying has been likened to taste-testing soup – a few spoonfuls tell what the whole pot tastes like. The key to the validity of any survey is randomness. Just as the soup must be stirred in order for the few spoonfuls to represent the whole pot, when sampling a population, the group must be stirred before respondents are selected. It is critical that respondents be chosen randomly so that the survey results can be generalized to the whole population. How well the sample represents the population is gauged by two important statistics – the survey’s margin of error and confidence level. In other words, Company X surveys customers and finds that 50 percent of the respondents say its customer service is “very good.”
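A minimal sketch of how a margin of error can be computed for a result like Company X's 50 percent is shown here. It uses the usual normal approximation for a proportion from a simple random sample; the sample size of 1,000 and the 95% confidence level are assumptions chosen for illustration, since the excerpt does not give them, and `margin_of_error` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion p with n respondents."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)

# Assumed survey: 1,000 randomly chosen respondents, 50% answering "very good".
moe = margin_of_error(p=0.5, n=1000)
low, high = 0.5 - moe, 0.5 + moe
```

Under these assumptions the margin of error is roughly 3 percentage points, so the population figure would be estimated at about 47 to 53 percent. Because n sits under a square root, quadrupling the sample size only halves the margin of error.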

determine sample size two-way ANOVA?

Computing required sample size for experiments to be analyzed by ANOVA is pretty complicated, with lots of possibilities. To learn more, consult books by Cohen or Bausell and Li, but plan to spend at least several hours. Two-way ANOVA, as you'd expect, is more complicated than one-way. The complexity comes from the many possible ways to phrase your question about sample size. There are two levels of the first factor. Say the factor is Drug, and you gave either the drug or the vehicle (placebo). If those limitations aren't a problem for you, then read on for a simple way to compute necessary sample size. Sample size is always determined to detect some hypothetical difference. What about units? Another way to look at this is to express the difference you expect to see as a fraction of the mean. Step 2 is to divide the result in step 1 by 2.00 to get the standardized effect size ES.