All about it: Time series analysis with the Box-Jenkins and Holt-Winters methods

Sreeram Kashyap · Analytics Vidhya · Jan 31, 2021 · 5 min read


Hi everyone. This is a continuation of my time series analysis explanation series. In this article, we will discuss some old-school methods used for time series analysis. I first saw these methods in my manufacturing class, where they were used for demand forecasting, and later in my data analytics coursework for sales forecasting, and I quickly realised how important these algorithms were before the advent of deep learning frameworks. Besides being fairly accurate, fast to calculate, and easy to apply, their main advantage is that they are interpretable, i.e. you have an explanation for every result obtained. That's why they have been used in industry for such a long time.

That being said, let’s jump in.

In this article, we will discuss in detail two of the main methods in time series analysis:

  1. Holt-Winters method (exponential smoothing)
  2. Box-Jenkins method (ARIMA model)

Exponential smoothing:
Exponential smoothing is an extension of the simple moving averages method used for forecasting. I'll describe moving averages briefly first, since it makes exponential smoothing easier to understand.

Moving averages Method:
It makes a forecast in the current time step based on the values recorded in the last n time steps.

Let p1, p2, p3, …, pn be the values recorded in the last n time steps.

The forecast for time t is given by

pt = (p1 + p2 + … + pn) / n

e.g.
let's say p1 = 5, p2 = 4, p3 = 6
then pt = (5+4+6)/3 = 5
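
A minimal sketch of this calculation in plain Python (the function and variable names are just illustrative, not from any library):

```python
def moving_average_forecast(values, n):
    """Forecast the next value as the mean of the last n observations."""
    window = values[-n:]
    return sum(window) / len(window)

# The three values from the example above
history = [5, 4, 6]
print(moving_average_forecast(history, n=3))  # 5.0
```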

Exponential smoothing:

Intuition tells you that this method can go wrong, because the values vary over time while every observation is weighted equally.

For instance, if you take demand data for 10 months and use a simple moving average to forecast the 11th month, you'll quickly realise that each month should not contribute equally to the forecast: the forecast should depend more on the most recent month than on the first one. So you'll want to assign different weights to the observations. This idea gives rise to exponential smoothing.

Formula:

Suppose we have a series of observed values y1, y2, y3, …, yn and we want to estimate the forecast Pt for time t. By the exponential smoothing formula, Pt is given by

Pt = α * y(t-1) + (1 - α) * P(t-1), starting from P2 = y1

where α is the smoothing factor, 0 < α < 1, and t ≥ 3.

Example:

Let y1 = 71, y2 = 70, y3 = 69, y4 = 68, y5 = 64
Let α = 0.1
In any forecasting, we can't predict the first value because there is no historical data. In exponential smoothing, forecasting starts at the third time step.
First set P2 = y1 = 71.
P3 = 0.1 * y2 + (1 - 0.1) * P2 = 0.1 * 70 + 0.9 * 71 = 70.9

Continuing the same way, P4 = 0.1 * y3 + 0.9 * P3 = 70.71 and P5 = 0.1 * y4 + 0.9 * P4 ≈ 70.44.

Here we have set α to 0.1. Usually, α needs to be tested over a range of values, say from 0.1 to 0.9. The error between the forecasts for each α and the known values helps us determine which value to use.
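
A quick sketch of single exponential smoothing in plain Python, using the example series above and trying a few α values; the helper names are mine, not a library API:

```python
def exponential_smoothing(y, alpha):
    """Single exponential smoothing: P(t) = alpha * y(t-1) + (1 - alpha) * P(t-1)."""
    forecasts = [y[0]]                       # P2 is initialised to y1
    for t in range(1, len(y) - 1):
        forecasts.append(alpha * y[t] + (1 - alpha) * forecasts[-1])
    return forecasts                         # forecasts[i] is the prediction for y[i + 1]

y = [71, 70, 69, 68, 64]                     # the example series above
print(exponential_smoothing(y, 0.1))         # ~[71, 70.9, 70.71, 70.439]

# Test alpha over 0.1 .. 0.9 and keep the value with the lowest mean absolute error
def mae(alpha):
    preds = exponential_smoothing(y, alpha)
    return sum(abs(p - actual) for p, actual in zip(preds, y[1:])) / len(preds)

best_alpha = min((round(0.1 * k, 1) for k in range(1, 10)), key=mae)
print("best alpha:", best_alpha)
```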

Note: The method discussed here is single exponential smoothing. It was later extended to double and triple exponential smoothing (the Holt and Holt-Winters methods), which give much better forecasts. They will be discussed in detail in a separate article.

Box-Jenkins Methods:

Now let's discuss the Box-Jenkins method, which is more elaborate than exponential smoothing.

Before applying it, a few key concepts need to be discussed.

Stationarity: A time series is stationary if its mean, variance, and autocorrelation do not change over time.

The Box-Jenkins method assumes that the data being used is stationary. So before using any Box-Jenkins model, we need to transform the data to make it stationary.

Detecting stationarity: This can be done by using autocorrelation plots or run sequence plots. Usually, autocorrelation plots are used because they are easy to understand and use.

Autocorrelation plots: Autocorrelation plots show whether the values of a time series are positively or negatively correlated with their own past values (hint: auto = self). Autocorrelation function (ACF) values range between -1 and 1.

Interpreting autocorrelation plots: the value at lag n gives the correlation between each observation and the observation recorded n time steps earlier. So the autocorrelation at lag 0 is always 1.

If the autocorrelation plot shows significant peaks that decay slowly, differencing is used to make the data stationary. After differencing, the autocorrelations drop off much more quickly.
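
As a rough sketch of how this check is commonly done in Python with pandas and statsmodels (the trending series below is made up purely for illustration):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

# A made-up trending (non-stationary) series, just for illustration
rng = np.random.default_rng(0)
series = pd.Series(np.arange(100) * 0.5 + rng.normal(0, 1, 100))

plot_acf(series, lags=20)                   # slow decay -> non-stationary
plot_acf(series.diff().dropna(), lags=20)   # peaks die out quickly after differencing
plt.show()
```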

Now that the data is stationary, we can continue with the forecasts.

There are two main models in the Box-Jenkins method.

Autoregressive Moving Average (ARMA) method:

The forecasts for this model are given by combining the autoregressive model and moving averages model.

Xt = c + εt + ϕ1 * X(t-1) + … + ϕp * X(t-p) + θ1 * ε(t-1) + … + θq * ε(t-q)

Where,
θ = moving average parameters
ϕ = autoregressive parameters
ε = error terms
c = constant
Notation: ARMA(p,q) for p autoregressive terms and q moving-average terms
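
A minimal sketch of fitting an ARMA(p, q) model with statsmodels; note that ARMA is just ARIMA with d = 0, and the order and data below are placeholders:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Placeholder stationary series; replace with your own (already stationary) data
rng = np.random.default_rng(1)
series = pd.Series(rng.normal(0, 1, 200))

# ARMA(2, 1) is ARIMA with order (p=2, d=0, q=1)
result = ARIMA(series, order=(2, 0, 1)).fit()
print(result.summary())
print(result.forecast(steps=5))   # forecasts for the next 5 time steps
```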

Autoregressive Integrated Moving Average (ARIMA) method:

In this method, on top of the p autoregressive components and q moving average components, we define a degree of differencing d, which gives the number of times differencing is applied to achieve stationarity in the data being used.

The forecasts in the ARIMA model are given by

(1 - ϕ1 * L - … - ϕp * L^p) * (1 - L)^d * Xt = c + (1 + θ1 * L + … + θq * L^q) * εt

Where,

L = lag operator

Notation: ARIMA(p,d,q)

Note: Partial autocorrelation (PACF) plots and autocorrelation (ACF) plots are used for selecting the p and q values respectively. Extended autocorrelation plots can be used as well.
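
Putting the pieces together, here is a hedged sketch of reading p and q from the PACF/ACF plots of the differenced series and then fitting an ARIMA(p, d, q) model with statsmodels; the series and the orders used are arbitrary placeholders, not recommendations:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA

# Placeholder non-stationary series (a random walk), just for illustration
rng = np.random.default_rng(2)
series = pd.Series(np.cumsum(rng.normal(0, 1, 300)))

# Difference once (d = 1), then read p off the PACF plot and q off the ACF plot
diffed = series.diff().dropna()
plot_pacf(diffed, lags=20)   # suggests p
plot_acf(diffed, lags=20)    # suggests q
plt.show()

result = ARIMA(series, order=(1, 1, 1)).fit()   # ARIMA(p=1, d=1, q=1) as a placeholder
print(result.forecast(steps=5))
```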

An implementation of the ARIMA model using python will be explained in the next article.

******************************************************************

…………………………………..Stay Tuned………………………………………

*******************************************************************
