Ordinary Least Squares Estimation

Ordinary Least Squares (OLS) is one of the most common methods used in econometrics. It is a linear regression technique that estimates the coefficients by minimizing the sum of squared residuals, that is, the squared differences between the actual and predicted values of the dependent variable. In this post, we walk through the practical process of estimating the coefficients using Ordinary Least Squares.


The Ordinary Least Squares Equation

Suppose we need to estimate the quantity demanded of a commodity ‘y’. According to economic theory, demand for any commodity depends on several factors such as its price, the prices of other commodities, income, etc. For illustration purposes, we will use three explanatory variables: the price of y (denoted by ‘p’), the price of a substitute good (‘x’) and income (‘i’). The quantity demanded of y is therefore the dependent variable, while its price, the price of the substitute good ‘x’ and income ‘i’ are the independent or explanatory variables. Each observation of quantity demanded (y) can be written in equation form as follows:

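$$ y_j = \beta_0 + \beta_1 p_j + \beta_2 x_j + \beta_3 i_j + \varepsilon_j, \qquad j = 1, \dots, n $$

Here $\beta_0$ is the constant, $\beta_1$, $\beta_2$ and $\beta_3$ are the coefficients of the price of y, the price of the substitute good and income, and $\varepsilon_j$ is the error term of observation $j$.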

Each observation in the data pairs one combination of values of the independent variables with a corresponding value of ‘y’. For instance, if there are 50 observations, then we have 50 values of ‘y’, each corresponding to one combination of values of the independent variables.

OLS: The Matrix Form and Normal Equation Estimation

Generally, OLS estimation is carried out using matrices. The above equation can also be expressed in the form of matrices as follows:

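$$
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix}
1 & p_1 & x_1 & i_1 \\
1 & p_2 & x_2 & i_2 \\
\vdots & \vdots & \vdots & \vdots \\
1 & p_n & x_n & i_n
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
$$

or, more compactly, $Y = X\beta + \varepsilon$, where $Y$ holds the observed quantities demanded, $X$ the explanatory variables, $\beta$ the coefficients and $\varepsilon$ the error terms. Minimizing the sum of squared residuals leads to the normal equation, whose solution is the OLS estimator:

$$ \hat{\beta} = (X'X)^{-1} X'Y $$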

The first column of X consists entirely of 1s. This column corresponds to the constant, which has no independent variable attached to it, so it is needed to estimate the constant.
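As a rough illustration of how the column of 1s enters X, the following sketch builds the matrix with NumPy. The function name `design_matrix` and the choice of NumPy are assumptions made for this example and are not part of the original post; the three observations are the first three rows of the sample data used later in the post.

```python
import numpy as np


def design_matrix(p, x, i):
    """Stack a column of ones with the three explanatory variables."""
    p, x, i = (np.asarray(v, dtype=float) for v in (p, x, i))
    ones = np.ones_like(p)  # one 1 per observation, for the constant
    return np.column_stack([ones, p, x, i])


# First three observations of the sample used later in the post.
X = design_matrix(p=[47, 35, 38], x=[49, 62, 63], i=[811, 769, 906])
print(X)
```

Each row of the printed matrix is one observation, and the leading 1 in every row is what allows the constant to be estimated alongside the other coefficients.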

The values of the coefficients can then be estimated using matrix multiplication, transposition and inversion. Any econometric software package can carry out these calculations easily.
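For readers who want the intermediate step, the normal equation behind these matrix operations can be derived in a few lines. The sketch below uses standard matrix notation, where $Y$ is the vector of observed values, $X$ the matrix of explanatory variables with a leading column of 1s, and $\beta$ the coefficient vector. The symbol $S(\beta)$ for the sum of squared residuals is introduced here for convenience, and the last step assumes that $X'X$ is invertible.

```latex
\begin{aligned}
S(\beta) &= (Y - X\beta)'(Y - X\beta)
          = Y'Y - 2\beta'X'Y + \beta'X'X\beta \\
\frac{\partial S}{\partial \beta} &= -2X'Y + 2X'X\beta = 0 \\
X'X\hat{\beta} &= X'Y \\
\hat{\beta} &= (X'X)^{-1}X'Y
\end{aligned}
```

The third line is the normal equation itself. The last line only rearranges it, which is why the estimation needs nothing more than a transpose, two matrix products and one inverse.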

Estimation with sample data

To demonstrate, we will consider a small hypothetical data sample of 10 observations for the variables used above.

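| Observation | Quantity demanded of commodity ‘y’ | Price of ‘y’ (p) | Price of substitute good ‘x’ | Income (i) |
|---|---|---|---|---|
| 1 | 232 | 47 | 49 | 811 |
| 2 | 247 | 35 | 62 | 769 |
| 3 | 290 | 38 | 63 | 906 |
| 4 | 246 | 42 | 65 | 803 |
| 5 | 239 | 43 | 64 | 761 |
| 6 | 265 | 33 | 69 | 774 |
| 7 | 245 | 44 | 60 | 814 |
| 8 | 227 | 36 | 47 | 735 |
| 9 | 237 | 46 | 68 | 775 |
| 10 | 273 | 37 | 53 | 853 |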

The quantity demanded ‘y’ can be predicted from any given values of the independent variables once their coefficients are known. This is done by substituting those values into the estimated equation:

$$ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 p + \hat{\beta}_2 x + \hat{\beta}_3 i $$

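To estimate the coefficients with the normal equation method, we first construct the matrices from the given data. Y contains the ten observed quantities demanded, and X contains a column of 1s followed by the columns for p, x and i:

$$
Y = \begin{pmatrix} 232 \\ 247 \\ 290 \\ 246 \\ 239 \\ 265 \\ 245 \\ 227 \\ 237 \\ 273 \end{pmatrix},
\qquad
X = \begin{pmatrix}
1 & 47 & 49 & 811 \\
1 & 35 & 62 & 769 \\
1 & 38 & 63 & 906 \\
1 & 42 & 65 & 803 \\
1 & 43 & 64 & 761 \\
1 & 33 & 69 & 774 \\
1 & 44 & 60 & 814 \\
1 & 36 & 47 & 735 \\
1 & 46 & 68 & 775 \\
1 & 37 & 53 & 853
\end{pmatrix}
$$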

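Solving the normal equation for the given data gives the following estimated equation (coefficients rounded to four decimal places):

$$ \hat{y} = 27.0302 - 2.0049\,p + 0.7060\,x + 0.3263\,i $$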
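The calculation can be reproduced with a few lines of NumPy. This is a minimal sketch rather than the procedure used for the original post: the variable names are chosen for this example, and the linear system is solved directly instead of forming the inverse explicitly, which is a common numerical choice and gives the same estimates.

```python
import numpy as np

# Sample data from the table above.
y = np.array([232, 247, 290, 246, 239, 265, 245, 227, 237, 273], dtype=float)
p = np.array([47, 35, 38, 42, 43, 33, 44, 36, 46, 37], dtype=float)
x = np.array([49, 62, 63, 65, 64, 69, 60, 47, 68, 53], dtype=float)
i = np.array([811, 769, 906, 803, 761, 774, 814, 735, 775, 853], dtype=float)

# Matrix of explanatory variables with a leading column of ones.
X = np.column_stack([np.ones_like(y), p, x, i])

# Normal equation: solve (X'X) beta = X'y for beta.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's built-in least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

for name, value in zip(["constant", "p", "x", "i"], beta_hat):
    print(f"{name:>8}: {value:.4f}")
```

The two sets of estimates, `beta_hat` and `beta_lstsq`, should agree up to floating-point error. On larger or nearly collinear data sets the least-squares routine is usually the numerically safer of the two.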

Making predictions after ordinary least squares

Suppose we have to predict the quantity demanded of commodity ‘y’ for p = 42, x = 57 and i = 780. Substituting these values into the estimated OLS equation gives the forecast:

$$ \hat{y} = 27.0302 - 2.0049(42) + 0.7060(57) + 0.3263(780) \approx 237.6 $$

Hence, the quantity demanded of ‘y’ is predicted to be about 238 at a price of 42, a substitute price of 57 and an income of 780.
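A prediction of this kind is just the dot product of the new observation with the estimated coefficient vector. The sketch below assumes NumPy and re-estimates the coefficients from the sample data so that it runs on its own; the names `data`, `x_new` and `y_pred` are illustrative.

```python
import numpy as np

# Columns: y, p, x, i (sample data from the post).
data = np.array([
    [232, 47, 49, 811],
    [247, 35, 62, 769],
    [290, 38, 63, 906],
    [246, 42, 65, 803],
    [239, 43, 64, 761],
    [265, 33, 69, 774],
    [245, 44, 60, 814],
    [227, 36, 47, 735],
    [237, 46, 68, 775],
    [273, 37, 53, 853],
], dtype=float)

y = data[:, 0]
X = np.column_stack([np.ones(len(y)), data[:, 1:]])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# New observation: a leading 1 for the constant, then p, x and i.
x_new = np.array([1.0, 42.0, 57.0, 780.0])
y_pred = x_new @ beta_hat
print(f"Predicted quantity demanded: {y_pred:.1f}")
```

The leading 1 in `x_new` plays the same role as the column of 1s in X: it carries the constant into the prediction.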

Estimating the residuals or error terms

The residuals are obtained by subtracting the predicted values of Y from the actual values of Y for every observation. They serve as estimates of the unobserved error terms and can be expressed as:

$$ e = Y - \hat{Y} = Y - X\hat{\beta} $$

For each observation, we can also obtain the predicted values with the help of estimated coefficients and the values of independent variables from those observations. Then, we can estimate the residuals from the actual and predicted values.

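| Observation | Quantity demanded of commodity ‘y’ | Price of ‘y’ (p) | Price of substitute good ‘x’ | Income (i) | Predicted ‘y’ (using coefficients) | Residual or error (y - predicted y) |
|---|---|---|---|---|---|---|
| 1 | 232 | 47 | 49 | 811 | 232 | 0 |
| 2 | 247 | 35 | 62 | 769 | 252 | -5 |
| 3 | 290 | 38 | 63 | 906 | 291 | -1 |
| 4 | 246 | 42 | 65 | 803 | 251 | -5 |
| 5 | 239 | 43 | 64 | 761 | 234 | 5 |
| 6 | 265 | 33 | 69 | 774 | 262 | 3 |
| 7 | 245 | 44 | 60 | 814 | 247 | -2 |
| 8 | 227 | 36 | 47 | 735 | 228 | -1 |
| 9 | 237 | 46 | 68 | 775 | 236 | 1 |
| 10 | 273 | 37 | 53 | 853 | 269 | 4 |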

Note: predicted ‘y’ has been rounded to the nearest integer here, because the quantity demanded of a commodity is generally a whole number: in most cases, consumers cannot buy a fraction of a unit. For example, a consumer cannot buy 0.3 units of a book.
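The predicted values and residuals in the table can be reproduced with the sketch below. It assumes NumPy, re-estimates the coefficients from the sample data so that it is self-contained, and follows the table's convention of rounding each prediction to a whole unit before subtracting. The variable names are illustrative.

```python
import numpy as np

# Columns: y, p, x, i (sample data from the post).
data = np.array([
    [232, 47, 49, 811],
    [247, 35, 62, 769],
    [290, 38, 63, 906],
    [246, 42, 65, 803],
    [239, 43, 64, 761],
    [265, 33, 69, 774],
    [245, 44, 60, 814],
    [227, 36, 47, 735],
    [237, 46, 68, 775],
    [273, 37, 53, 853],
], dtype=float)

y = data[:, 0]
X = np.column_stack([np.ones(len(y)), data[:, 1:]])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ beta_hat       # predicted values, unrounded
residuals = y - y_hat      # residuals, unrounded

# Table convention: round predictions to whole units, then subtract.
y_hat_rounded = np.rint(y_hat).astype(int)
residuals_rounded = y.astype(int) - y_hat_rounded

rows = zip(y.astype(int).tolist(), y_hat_rounded.tolist(), residuals_rounded.tolist())
for obs, (actual, fitted, resid) in enumerate(rows, start=1):
    print(f"{obs:>2}  actual={actual}  predicted={fitted}  residual={resid:+d}")

print(f"Sum of unrounded residuals: {residuals.sum():.6f}")
```

Because the model includes a constant, the unrounded residuals sum to zero up to floating-point error. The rounded residuals in the table sum to -1, which is purely an effect of the rounding.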

