All about it: Time series analysis — Introduction

Sreeram Kashyap
Published in Analytics Vidhya
Jan 30, 2021


TIME SERIES ANALYSIS

In this series of articles I am going to discuss a major type of data that we find in many real-world situations: time series data. Since it is a huge topic, I will cover it over multiple articles. As we move forward, you should be able to go from knowing nothing about it to completing at least one time series project in Python and R. So stay tuned.

Data about an event collected over a long period of time often shows an underlying trend. That trend helps us find patterns in the data and predict how the entity under study will behave in the near future. This kind of data is called time series data. Let’s look at the definition first.

Definition: An ordered sequence of values of a variable at equally spaced time intervals.

The data is usually indexed with timestamps. Stock market data is a great example of time series data. See the Tesla stock price chart for an example.
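To make the definition concrete, here is a minimal sketch in Python using pandas. The prices and dates are made up for illustration (they are not real Tesla data), and the names `prices` and `is_equally_spaced` are my own. It builds a small series indexed by timestamps and checks that the timestamps are equally spaced.

```python
import pandas as pd

# Hypothetical daily closing prices; the values are invented for illustration.
timestamps = pd.date_range(start="2021-01-04", periods=5, freq="D")
prices = pd.Series([100.0, 101.5, 99.8, 102.3, 103.1], index=timestamps, name="close")

# Equal spacing: every gap between consecutive timestamps is the same.
gaps = prices.index.to_series().diff().dropna()
is_equally_spaced = bool(gaps.nunique() == 1)

print(prices)
print("Equally spaced:", is_equally_spaced)
```

Real stock data skips weekends and holidays, so it is only equally spaced in terms of trading days. That is one of the details you have to decide how to handle before analysis.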

Types of data:

Time series data regarding an event is mainly of two types.

Univariate time series: Measurements of a single variable collected over time. There is only one series to study.

Multivariate time series: Measurements of multiple variables recorded at each timestamp. It consists of multiple time series that are related to each other. (This should not be confused with cross-sectional data, which records many subjects at a single point in time.) It is much more complex to analyze than a univariate series. Most real-life phenomena need a multivariate dataset to be understood completely.
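The difference between the two types is easy to see in code. The sketch below uses pandas with invented numbers. The column names `close` and `volume` are assumptions chosen for illustration. A univariate series holds one value per timestamp, while a multivariate one holds several related values per timestamp.

```python
import pandas as pd

timestamps = pd.date_range(start="2021-01-04", periods=4, freq="D")

# Univariate: one variable per timestamp.
univariate = pd.Series([100.0, 101.5, 99.8, 102.3], index=timestamps, name="close")

# Multivariate: several related variables per timestamp (hypothetical columns).
multivariate = pd.DataFrame(
    {
        "close": [100.0, 101.5, 99.8, 102.3],
        "volume": [1200, 1500, 900, 1700],
    },
    index=timestamps,
)

print(univariate.shape)    # one value per timestamp
print(multivariate.shape)  # two values per timestamp
```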

Components of time series data:

Any time series is a combination of the following main components.

· Trend: It shows the overall change in the dataset, that is, whether the variable increases or decreases on average over time. A trend is usually visible only when data is available over a long enough duration relative to the event under discussion. For instance, you can see that the Tesla stock price increased over the past 5 years.

· Seasonality: It shows any periodic changes in the data. In other words, any regularly repeating low or high points in the data constitute seasonality. For instance, a stock reaching an above-average peak once every year is a seasonal pattern.

· Irregular changes/outliers/anomalies: At times the stock may rise or fall suddenly by a large amount, a kind of behavior not seen before in the dataset. Such movements are irregular changes, or anomalies. Anomalies are a large area of research and discussion because identifying and eliminating them plays a key role in making accurate forecasts. Anomalies can skew forecasts by unpredictable amounts, so finding and eliminating them is a major part of the work of any data scientist dealing with time series data.
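The three components above can be sketched with a simple additive decomposition. This is only an illustration under stated assumptions, not a production method. The series is synthetic, and the weekly period, the injected spike at position 30 and the 3-standard-deviation threshold are all my own choices. The trend is estimated with a centered moving average, the seasonality with the average detrended value at each position in the cycle, and whatever is left over is the irregular part.

```python
import numpy as np
import pandas as pd

period = 7   # assumed length of one seasonal cycle (weekly, for daily data)
n = 56       # eight full cycles of synthetic daily data
t = np.arange(n)

# Synthetic series = linear trend + repeating weekly pattern + one injected spike.
weekly_pattern = np.array([3.0, 1.0, 0.0, -1.0, -2.0, -2.0, 1.0])  # sums to zero
values = 0.5 * t + weekly_pattern[t % period]
values[30] += 25.0  # hypothetical anomaly
series = pd.Series(values, index=pd.date_range("2021-01-01", periods=n, freq="D"))

# Trend: centered moving average over one full cycle, which averages out seasonality.
trend = series.rolling(window=period, center=True).mean()

# Seasonality: average detrended value at each position within the cycle.
detrended = series - trend
seasonal = detrended.groupby(t % period).transform("mean")

# Irregular component: whatever trend and seasonality do not explain.
residual = detrended - seasonal

# Flag points whose residual is unusually large (the threshold is an assumption).
anomalies = residual[residual.abs() > 3 * residual.std()]
print(anomalies)
```

Note that the spike also leaks a little into the trend and seasonal estimates around it, which is one reason anomalies are usually dealt with before forecasting.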

Applications of time series analysis:

So far I have used the stock market to explain everything, which may give the impression that it is the prime area for time series analysis. If so, let’s clear that up now and get a sense of the scope of the field.

Time series analysis is a key component of any industry whose events show time-dependent behavior.

(The statement might not be clear at first glance, but hang on!)

Some applications of time series analysis are listed below.

1. Sales forecasting: e.g. sales in retail stores

2. Economic forecasting: e.g. consumption in a region

3. Medical research: e.g. predicting cardiac arrest from cardiographs

4. Manufacturing: e.g. predicting machine breakdowns, tool wear and tear

5. Process and quality control: e.g. control charts

Fig: Cardiographs. Time series analysis of cardiographs can help predict a cardiac arrest.

This should give you a pretty good idea that time series analysis is important not just in the stock market but in many other fields.

Note: The above list of applications is not exhaustive. Many more applications exist, and new ones appear every day.

Methods/Techniques: Now let’s briefly look at the methods used in time series analysis. I will discuss each of these, and others, in detail in future articles.

Forecasting a time series can be laborious, but the results are well worth the effort. Historically, two main methods have been used for time series analysis.

1. Box-Jenkins method

It is named after statisticians George Box and Gwilym Jenkins.

2. Holt-Winters exponential smoothing method
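To give a first feel for exponential smoothing, here is a minimal sketch of simple (single) exponential smoothing in plain Python. This is only the basic building block and not the full Holt-Winters method, which adds trend and seasonal terms on top of it. The function name, the sample observations and the choice of `alpha=0.5` are assumptions made for illustration.

```python
def simple_exponential_smoothing(values, alpha):
    """Smooth a sequence: s[0] = x[0], s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = [values[0]]
    for x in values[1:]:
        # Recent observations get weight alpha; the smoothed history gets the rest.
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


observations = [10.0, 12.0, 11.0, 13.0]  # made-up data
smoothed = simple_exponential_smoothing(observations, alpha=0.5)

# With this simple model, the one-step-ahead forecast is the last smoothed value.
forecast = smoothed[-1]
print(smoothed, forecast)
```

A larger `alpha` makes the smoothed series react faster to recent changes, while a smaller one gives more weight to history.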

While these have been used historically in the industry for reliable forecasting, there is now a lot of work on time series forecasting using machine learning and deep learning. For instance, LSTMs and transformers have been tried extensively for time series forecasting and have shown promising results. However, for reasons such as limited interpretability, adoption has been low, although that is changing now.

We will see all the methods including LSTM and transformers in future articles.

…………………………………..Stay tuned…………………………………

References: Information gathered from the following sources, among others: 1.1 Overview of Time Series Characteristics | STAT 510 (psu.edu); CRAN Task View: Time Series Analysis (r-project.org); NIST/SEMATECH e-Handbook of Statistical Methods; Time series (Wikipedia).
