Autocorrelation: Causes and Consequences

Autocorrelation is a situation where the error terms in a model are correlated with, or dependent on, each other: the error in a given period is influenced by the errors of previous periods. It is a common problem faced by economists because of the nature of time-series data, where successive values of variables and error terms affect current-period values.

One of the assumptions of OLS states that there should be no autocorrelation in the model. This assumption is also referred to as serial independence: the error term in any given period is independent of the error terms of other periods. That is, error terms do not affect each other.

It is important to note that autocorrelation may also exist in cross-sectional data, but it is mostly observed in time series analysis. It is rarely observed in cross-sectional data because the error term associated with one observation generally doesn't affect other observations. Consider the example of consumption: a change in the consumption of one household due to a random factor (such as winning a lottery) will not affect the consumption of other households.

Autocorrelation vs Serial Correlation

The terms autocorrelation and serial correlation are usually used synonymously. But, there is a theoretical distinction between the two. Autocorrelation occurs when the current value of a time series variable is dependent on its own lagged values. On the other hand, serial correlation is a situation where the current value of a time series variable is dependent on the lagged values of another time series.

Causes Of Autocorrelation

  • Omission of important independent variable: if an important independent variable is not included in the model, then its effects are captured in the error term. As a result, errors will become dependent on each other if the excluded variable is autocorrelated. Time series variables tend to be autocorrelated due to their inherent nature. For example, income in the current period generally depends on the previous period’s income. If the income variable is excluded from the consumption function, the model may end up being affected by autocorrelation. This is because the excluded income variable is most likely autocorrelated and will be captured by the error terms.
  • Nature of the problem: in some cases, error terms are autocorrelated due to the nature of the economic phenomenon. Consider the example of agricultural production and droughts. A drought is a random factor that can cause autocorrelation because it affects production not only in the current period but also in subsequent periods. Hence, autocorrelation may exist because certain random factors have effects that persist across more than one time period.
  • Mis-specification of functional form: a wrong functional form of the model may also cause autocorrelation. For example, the true relationship between the variables may be cyclical, but the model uses a linear functional form. In such a case, the error terms can become correlated because the cyclical effects are not captured by the explanatory variables.
  • Non-stationarity: a time series is stationary if its features (such as mean and variance) are constant over time. If the time series variables in a model are non-stationary, then the error term may also be non-stationary. As a result, the error term might be autocorrelated, with varying characteristics (mean, variance, covariance, etc.) over time.
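The dependence of each error on the previous one can be made concrete with a small simulation. The sketch below (a minimal illustration using NumPy, with made-up parameter values) generates errors following a first-order autoregressive process, u_t = ρ·u_{t-1} + ε_t, and checks that successive errors are indeed correlated:

```python
import numpy as np

def ar1_errors(n, rho, sigma=1.0, seed=0):
    """Generate n error terms following u_t = rho * u_{t-1} + e_t."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0, sigma, n)
    u = np.zeros(n)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + e[t]
    return u

u = ar1_errors(500, rho=0.8)
# The sample correlation between u_t and u_{t-1} should be close to rho
r1 = np.corrcoef(u[:-1], u[1:])[0, 1]
print(round(r1, 2))
```

With ρ = 0.8, the lag-1 correlation of the simulated errors comes out near 0.8, which is exactly the pattern serial independence rules out.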

Consequences Of Autocorrelation

  • Underestimation of residual variance: the variance of the error term is underestimated when the errors are autocorrelated. If the autocorrelation is positive, this problem becomes even more serious.
  • Variance of estimates: in the presence of autocorrelation, the Ordinary Least Squares (OLS) estimates remain unbiased, but they are no longer efficient: their true sampling variance is larger than that of estimators such as GLS. At the same time, the variance estimated by the usual OLS formula understates this true variance. That is, the reported variance of the coefficients is smaller than their actual variance.
  • Tests of significance: the usual inference tools, such as standard errors, t-tests and F-tests, give misleading results and are no longer reliable because the variance of the coefficients is underestimated. The estimated standard errors are too small, which inflates the t-values, so these tests cannot be used to determine the statistical significance of coefficients.
  • Predictions: under autocorrelation, the predictions made by the OLS model are inefficient. That is, they have a larger variance than predictions from other methods such as GLS.
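The gap between the true variability of the OLS slope and the variance reported by the usual OLS formula can be demonstrated with a small Monte Carlo experiment. This is a sketch with made-up parameters (trend regressor, AR(1) errors with ρ = 0.8), not a general proof:

```python
import numpy as np

rng = np.random.default_rng(42)
n, rho, n_sims = 100, 0.8, 2000
x = np.linspace(0, 10, n)
X = np.column_stack([np.ones(n), x])

slopes, reported_se = [], []
for _ in range(n_sims):
    # AR(1) errors: u_t = rho * u_{t-1} + e_t
    e = rng.normal(0, 1, n)
    u = np.zeros(n)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + e[t]
    y = 2.0 + 0.5 * x + u

    # OLS estimate and the usual (iid-based) standard error of the slope
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    slopes.append(b[1])
    reported_se.append(np.sqrt(cov[1, 1]))

true_sd = np.std(slopes)            # actual sampling variability of the slope
avg_reported = np.mean(reported_se) # what the OLS formula claims on average
print(true_sd > avg_reported)
```

Across the simulations, the actual spread of the slope estimates exceeds the average standard error reported by the OLS formula, illustrating the underestimation described above.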


Durbin-Watson d Statistic

This test applies to the first-order autoregressive scheme, AR(1), in which the error term depends only on the error of the immediately preceding period: u_t = ρu_{t−1} + ε_t, where −1 ≤ ρ ≤ 1 and ε_t is a well-behaved (white-noise) error.

For such a model, the Durbin-Watson d statistic can be estimated as follows:

d = [ Σ (û_t − û_{t−1})² ] / [ Σ û_t² ]

where the numerator sum runs over t = 2, …, n and the denominator sum over t = 1, …, n, with û_t denoting the OLS residuals. In large samples, d ≈ 2(1 − ρ̂), where ρ̂ is the estimated first-order autocorrelation of the residuals.

The value of the Durbin-Watson d statistic lies between 0 and 4. A value of d = 2 means that there is no autocorrelation (ρ = 0). At one extreme, d approaches 0 when ρ = +1, which implies perfect positive autocorrelation. At the other extreme, when ρ = −1, d approaches 4, indicating perfect negative autocorrelation.

However, this test has some serious limitations:

  • It can be used to test first-order autocorrelation scheme only.
  • The error terms must be normally distributed.
  • The lagged values of the dependent variable must not be included in the model.

Breusch-Godfrey LM Test

This test tries to overcome the limitations of the Durbin-Watson d statistic. It is applicable to higher-order autoregressive schemes, moving average models and models with lagged values of the dependent variable.

The test assumes an autoregressive scheme of order p, or pth-order autoregressive scheme AR(p), for the error term, because it includes p lagged values of the error term: u_t = ρ₁u_{t−1} + ρ₂u_{t−2} + … + ρ_p u_{t−p} + ε_t.

The LM test follows the following procedure:

Step 1: estimate the simple OLS model and obtain the residuals û_t.

Step 2: regress û_t on all the independent variables (X_t in this example) and the lagged residuals determined by the order of the autoregressive scheme (û_{t−1}, û_{t−2}, …, û_{t−p} in this example).

Step 3: obtain the R² of this auxiliary regression.

Step 4: in a large sample, (n − p)R² follows the chi-square distribution with p degrees of freedom: (n − p)R² ~ χ²(p).

If this value exceeds the critical chi-square value at the chosen level of significance, at least one of the lagged values of the error term is statistically significant. Therefore, we reject the null hypothesis of no autocorrelation and conclude that autocorrelation exists.

Autocorrelation Function, ACF and PACF plots

We can also use the autocorrelation function on the residuals to detect the presence of autocorrelation. Additionally, autocorrelation function plots (ACF plots) and partial autocorrelation function plots (PACF plots) can be used to understand the autoregressive and moving average behaviour of the time series. Usually, these methods help in assessing the stationarity of time series variables. You can learn more about these methods from Autocorrelation Function and Interpreting ACF and PACF Plots.


Remedies for Autocorrelation

The omission of important explanatory variables or a wrong functional form can cause autocorrelation. In such cases, including the omitted variables or trying different functional forms can eliminate the problem. However, dealing with pure autocorrelation is not straightforward because it can be inherent in time series variables. Some of the methods that can be employed are as follows:

Newey-West Standard Errors

Newey-West standard errors are also known as heteroscedasticity- and autocorrelation-consistent (HAC) standard errors. As the name suggests, these standard errors are corrected for both heteroscedasticity and autocorrelation and are an extension of White's heteroscedasticity-consistent standard errors.

Because Newey-West standard errors correct for autocorrelation as well as heteroscedasticity, the usual t-tests and p-values can be used to determine the significance of coefficients. Newey-West standard errors are typically larger than OLS standard errors, which understate the true variability in the presence of positive autocorrelation.

Note: Newey-West standard errors should be applied to large samples only. They should not be used in small samples.

Feasible Generalized Least Squares

The application of Generalized Least Squares in the presence of autocorrelation is also referred to as Feasible Generalized Least Squares (FGLS). The idea of this method is similar to Weighted Least Squares (WLS) in the presence of heteroscedasticity: the variables are weighted in such a manner that the transformed model is no longer affected by autocorrelation.

The weights are determined by the autocorrelation coefficient (ρ) and are used to transform the model. Consider the simple two-variable model with a first-order autoregressive scheme, AR(1): Y_t = β₁ + β₂X_t + u_t, with u_t = ρu_{t−1} + ε_t. Quasi-differencing gives the transformed model Y_t − ρY_{t−1} = β₁(1 − ρ) + β₂(X_t − ρX_{t−1}) + ε_t, whose error term ε_t is free of autocorrelation.

In practice, the ρ coefficient is unknown and must be estimated. Several methods can be used to accomplish that, such as the Durbin-Watson d statistic or the Cochrane-Orcutt iterative method. If ρ is close to one, we can simply use first differences.

Time Series Models

Time series models are among the most essential techniques for economists, involving the study of stationarity, cointegration and specialised models for time series data. These include ARMA, ARIMA and SARIMA models. Furthermore, multiple-equation models such as VAR and VECM can be employed by economists to study multiple relationships and cointegration.
