Ordinary Least Squares (OLS) is one of the most common methods used in econometrics. It is a linear regression technique that estimates the coefficients by minimizing the sum of squared residuals, that is, the squared differences between the actual and predicted values of the dependent variable. In this post, we will walk through the practical process of estimating the coefficients using Ordinary Least Squares.
The Ordinary Least Squares Equation
Suppose we need to estimate the quantity demanded of a commodity ‘y’. According to economic theory, demand for any commodity depends on several factors such as its price, the prices of other commodities, income, etc. For illustration, we will use three variables: the price of y (denoted by ‘p’), the price of a substitute good (x) and income (i). The quantity demanded of y is therefore the dependent variable, while its price, the price of the substitute good ‘x’ and income (i) are the independent or explanatory variables. Each observation of quantity demanded (y) can be written in equation form as follows:

$$
y = \beta_0 + \beta_1 p + \beta_2 x + \beta_3 i + \varepsilon
$$

Here $\beta_0$ is the constant, $\beta_1$, $\beta_2$ and $\beta_3$ are the coefficients of the three independent variables, and $\varepsilon$ is the error term.
Moreover, each observation in the data pairs one combination of values of the independent variables with a corresponding value of ‘y’. For instance, if there are 50 observations, then we have 50 values of ‘y’, one for each of the 50 combinations of the independent variables’ values.
OLS: The Matrix Form and Normal Equation Estimation
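Generally, OLS estimation is carried out using matrices. With n observations, the above equation can be expressed in matrix form as follows:

$$
Y = X\beta + \varepsilon, \qquad
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & p_1 & x_1 & i_1 \\
1 & p_2 & x_2 & i_2 \\
\vdots & \vdots & \vdots & \vdots \\
1 & p_n & x_n & i_n
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
$$

Here Y holds the observed quantities demanded, X holds the independent variables, $\beta$ holds the coefficients (with $\beta_0$ as the constant) and $\varepsilon$ holds the error terms.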
Furthermore, the first column of X consists entirely of 1s. It corresponds to the constant, which has no independent variable attached to it, so this column is necessary to estimate the constant.
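For reference, here is a short sketch of where the estimator comes from. The notation is an assumption of this sketch: Y is the n × 1 vector of observations on ‘y’, X is the matrix of independent variables with a leading column of 1s, b is any candidate coefficient vector, and a prime denotes the transpose. OLS chooses b to minimize the sum of squared residuals:

$$
S(b) = (Y - Xb)'(Y - Xb) = Y'Y - 2\,b'X'Y + b'X'Xb
$$

Setting the derivative with respect to b equal to zero gives the normal equations:

$$
\frac{\partial S}{\partial b} = -2X'Y + 2X'Xb = 0
\quad\Longrightarrow\quad
X'X\,\hat{\beta} = X'Y
$$

Solving this system for $\hat{\beta}$ requires X'X to be invertible, which holds when no independent variable is an exact linear combination of the others.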
Hence, the coefficients can be estimated with the normal equation $\hat{\beta} = (X'X)^{-1}X'Y$, where $X'$ is the transpose of X. It requires only matrix multiplication, transposition and inversion, and any econometric software package can carry out these calculations.
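As an illustration, the normal-equation calculation can be sketched in a few lines of NumPy. The function name `ols_normal_equation` and the tiny made-up data set are assumptions for this example only; the sketch also solves the linear system rather than forming the inverse explicitly, which gives the same estimates in exact arithmetic.

```python
import numpy as np


def ols_normal_equation(X, y):
    """Return the coefficient vector b that solves (X'X) b = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    XtX = X.T @ X  # cross-product matrix X'X
    Xty = X.T @ y  # vector X'y
    # Solve the linear system instead of computing the inverse of X'X.
    return np.linalg.solve(XtX, Xty)


# Made-up check: four points that lie exactly on y = 1 + 2*x.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column of 1s for the constant
y = 1.0 + 2.0 * x

b = ols_normal_equation(X, y)
print(b)  # constant first, then the slope
```

Because the points in this toy example lie exactly on a line, the estimated constant and slope should recover that line up to floating-point error.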
Estimation with sample data
To demonstrate, we will consider a small hypothetical data sample of 10 observations for the variables used above.
Observations | Quantity demanded of commodity ‘y’ | Price of ‘y’ (p) | Price of substitute good ‘x’ | Income (i) |
1 | 232 | 47 | 49 | 811 |
2 | 247 | 35 | 62 | 769 |
3 | 290 | 38 | 63 | 906 |
4 | 246 | 42 | 65 | 803 |
5 | 239 | 43 | 64 | 761 |
6 | 265 | 33 | 69 | 774 |
7 | 245 | 44 | 60 | 814 |
8 | 227 | 36 | 47 | 735 |
9 | 237 | 46 | 68 | 775 |
10 | 273 | 37 | 53 | 853 |
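The quantity demanded ‘y’ can be predicted from any given values of the independent variables once their coefficients have been estimated. This is accomplished by substituting the values into the following equation:

$$
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 p + \hat{\beta}_2 x + \hat{\beta}_3 i
$$

The hats indicate that $\hat{\beta}_0$ to $\hat{\beta}_3$ are the estimated constant and coefficients, and that $\hat{y}$ is the predicted quantity demanded.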
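From the given data, we can construct the matrices needed to estimate the coefficients with the normal equation method. Y contains the ten observed quantities demanded, and each row of X contains a 1 for the constant followed by that observation's values of p, x and i:

$$
Y = \begin{bmatrix} 232 \\ 247 \\ 290 \\ 246 \\ 239 \\ 265 \\ 245 \\ 227 \\ 237 \\ 273 \end{bmatrix},
\qquad
X = \begin{bmatrix}
1 & 47 & 49 & 811 \\
1 & 35 & 62 & 769 \\
1 & 38 & 63 & 906 \\
1 & 42 & 65 & 803 \\
1 & 43 & 64 & 761 \\
1 & 33 & 69 & 774 \\
1 & 44 & 60 & 814 \\
1 & 36 & 47 & 735 \\
1 & 46 & 68 & 775 \\
1 & 37 & 53 & 853
\end{bmatrix}
$$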
On solving, the values of coefficients based on the given data were estimated as:
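The estimates can be reproduced with a short NumPy sketch. It builds Y and X from the ten observations in the table and solves the normal equations; the variable names (`data`, `beta_hat`) are choices made for this example, and the printed values depend on running the code rather than being quoted here.

```python
import numpy as np

# Columns: quantity demanded y, price p, substitute price x, income i
data = np.array([
    [232, 47, 49, 811],
    [247, 35, 62, 769],
    [290, 38, 63, 906],
    [246, 42, 65, 803],
    [239, 43, 64, 761],
    [265, 33, 69, 774],
    [245, 44, 60, 814],
    [227, 36, 47, 735],
    [237, 46, 68, 775],
    [273, 37, 53, 853],
], dtype=float)

Y = data[:, 0]
# Prepend a column of 1s so that the constant can be estimated.
X = np.column_stack([np.ones(len(data)), data[:, 1:]])  # shape (10, 4)

# Normal equations: (X'X) beta_hat = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

for name, value in zip(["constant", "p", "x", "i"], beta_hat):
    print(f"{name:>8}: {value:.4f}")
```

The order of `beta_hat` follows the columns of X: the constant first, then the coefficients of p, x and i.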
Making predictions after ordinary least squares
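Suppose we have to predict the quantity demanded of commodity ‘y’, given values of the independent variables p = 42, x = 57 and i = 780. We can then forecast ‘y’ by substituting these values into the estimated OLS equation:

$$
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 (42) + \hat{\beta}_2 (57) + \hat{\beta}_3 (780)
$$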
Hence, the predicted quantity demanded of ‘y’ is 241 when its price is 42, the price of the substitute ‘x’ is 57 and income is 780.
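The same prediction can be sketched in NumPy. This example re-estimates the coefficients from the ten observations with `np.linalg.lstsq` (a least-squares solver used here in place of the explicit normal equation) and then multiplies them by the new values; `new_obs` and `y_pred` are names chosen for this illustration. If the coefficients match those used in the post, the rounded result should agree with the figure reported above.

```python
import numpy as np

# Columns: quantity demanded y, price p, substitute price x, income i
data = np.array([
    [232, 47, 49, 811], [247, 35, 62, 769], [290, 38, 63, 906],
    [246, 42, 65, 803], [239, 43, 64, 761], [265, 33, 69, 774],
    [245, 44, 60, 814], [227, 36, 47, 735], [237, 46, 68, 775],
    [273, 37, 53, 853],
], dtype=float)

Y = data[:, 0]
X = np.column_stack([np.ones(len(data)), data[:, 1:]])

# Least-squares estimate of [constant, coefficient of p, of x, of i]
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# New observation: 1 for the constant, then p = 42, x = 57, i = 780
new_obs = np.array([1.0, 42.0, 57.0, 780.0])
y_pred = float(new_obs @ beta_hat)

print(f"Predicted quantity demanded: {y_pred:.2f} (rounded: {round(y_pred)})")
```

The leading 1 in `new_obs` plays the same role as the column of 1s in X: it multiplies the constant.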
Estimating the residuals or error terms
The residuals, or estimated error terms, are obtained by subtracting the predicted values of Y from the actual values of Y for every observation. Therefore, they can be expressed as:

$$
e = Y - \hat{Y}, \qquad \hat{Y} = X\hat{\beta}
$$
For each observation, we can obtain the predicted value from the estimated coefficients and that observation's values of the independent variables. The residual is then the actual value minus the predicted value.
Observations | Quantity demanded of commodity ‘y’ | Price of ‘y’ (p) | Price of substitute good ‘x’ | Income (i) | Predicted ‘y’ (using coefficients) | Residuals or error (y – predicted y) |
1 | 232 | 47 | 49 | 811 | 232 | 0 |
2 | 247 | 35 | 62 | 769 | 252 | -5 |
3 | 290 | 38 | 63 | 906 | 291 | -1 |
4 | 246 | 42 | 65 | 803 | 251 | -5 |
5 | 239 | 43 | 64 | 761 | 234 | 5 |
6 | 265 | 33 | 69 | 774 | 262 | 3 |
7 | 245 | 44 | 60 | 814 | 247 | -2 |
8 | 227 | 36 | 47 | 735 | 228 | -1 |
9 | 237 | 46 | 68 | 775 | 236 | 1 |
10 | 273 | 37 | 53 | 853 | 269 | 4 |
Note: predicted ‘y’ has been rounded to the nearest integer here. The quantity demanded of a commodity is generally a whole number because consumers cannot buy fractions of a commodity in most cases. For example, a consumer cannot buy 0.3 units of a book.
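The residual calculation can also be sketched in NumPy. The example below fits the model to the ten observations with `np.linalg.lstsq`, computes the fitted values and residuals without rounding, and prints them next to the actual values; the names `fitted` and `residuals` are choices made for this illustration. Rounding the fitted values to integers should give columns comparable to the table above, assuming the same coefficients.

```python
import numpy as np

# Columns: quantity demanded y, price p, substitute price x, income i
data = np.array([
    [232, 47, 49, 811], [247, 35, 62, 769], [290, 38, 63, 906],
    [246, 42, 65, 803], [239, 43, 64, 761], [265, 33, 69, 774],
    [245, 44, 60, 814], [227, 36, 47, 735], [237, 46, 68, 775],
    [273, 37, 53, 853],
], dtype=float)

Y = data[:, 0]
X = np.column_stack([np.ones(len(data)), data[:, 1:]])

beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

fitted = X @ beta_hat    # predicted y for every observation
residuals = Y - fitted   # actual y minus predicted y

print("obs  actual  predicted  residual")
for obs, (actual, pred, res) in enumerate(zip(Y, fitted, residuals), start=1):
    print(f"{obs:>3}  {actual:6.0f}  {pred:9.2f}  {res:8.2f}")

# With a constant in the model, OLS residuals sum to zero up to rounding error.
print("Sum of residuals:", round(float(residuals.sum()), 6))
```

The final line is a quick sanity check: because X contains a column of 1s, the unrounded residuals sum to zero apart from floating-point error, whereas residuals computed from rounded predictions need not.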