EXCEL 97: Two-Variable Regression using Excel functions
A. Colin Cameron, Dept. of Economics, Univ. of Calif.
- Davis
This September 1999 help sheet gives information on how to
-
Fit a regression line using Excel functions INTERCEPT, SLOPE, RSQ, STEYX
and FORECAST.
-
Fit a regression line using Excel function LINEST.
-
Obtain regression forecasts using Excel function TREND.
For most purposes these Excel functions are unnecessary.
It is easier to instead use Tools | Data Analysis | Regression Analysis
or to Add a trendline to a two-way scatterplot.
TWO VARIABLE REGRESSION
There are several ways to fit a regression line in Excel
For interpretation of regression output see Excel:
Two-Variable Regression using Tools | Data Analysis
INPUT DATA
The data used are in number of cars per household (Y) and household
size (X).
The spreadsheet cells A1:B6 should look like:
cars |
hh size |
1 |
1 |
2 |
2 |
2 |
3 |
2 |
4 |
3 |
5 |
EXCEL FUNCTIONS INTERCEPT, SLOPE, RSQ, STEYX and FORECAST
The model is y = b1 + b2*x + u, where the error term
u has mean 0 and variance sigma-squared.
The individual functions INTERCEPT, SLOPE, RSQ, STEYX and
FORECAST can be used to get key results for two-variable regression
-
INTERCEPT(A1:A6,B1:B6) yields the OLS intercept estimate of 0.8
-
SLOPE(A1:A6,B1:B6) yields the OLS slope estimate of 0.4
-
RSQ(A1:A6,B1:B6) yields the R-squared of 0.8
-
STEYX(A1:A6,B1:B6) yields the standard error of the regression of
0.36515 0.8
-
FORECAST(6,A1:A6,B1:B6) yields the OLS forecast value of Yhat=3.2
for X=6 (forecast 3.2 cars for household of size 6).
Thus the estimated model is
y = 0.8 + 0.4*x
with R-squared of 0.8 and estimated standard deviation of u of 0.36515
and we forecast that for x = 6 we have y = 0.8 + 0.4*6 = 3.2.
EXCEL FUNCTION LINEST
The individual function LINEST can be used to get regression
output similar to that
several forecasts from a two-variable regression.
This is tricky to use:
-
In cell A11 give the function LINEST(A2:A6,B2:B6,1,1).
Note that labels are not included when using function LINEST.
-
This gives only one value of 0.4 in cell A11. We need the other regression
output as well.
The output for two-variable regression is a 5 row by 2 column array.
-
Highlight cells A11:B15 and hit F2 key (then Edit appears at the bottom
of the screen).
Hit CTRL-SHIFT-ENTER.
This now gives values of 0.4 in cell A11, 0.8 in cell B11 and
-
3.6 in B12.
The output in cells A11:B15 is
0.4
0.8
0.11547
0.38297
0.8
0.36514
12.0
3.0
1.6
0.4
which represents
Slope coeff Intercept
coeff
St.error of slope St.error of intercept
R-squared
St.error of regression
F-test overall Degrees of freedom
(n-k)
Regression SS Residual SS
To get just the coefficients give the LINEST command with the last entry
0 rather than 1, ie. .LINEST(A2:A6,B2:B6,1,1),
and then highlight cells A21:B21, say, hit F2 key, and hit CTRL-SHIFT-ENTER.
LINEST can be extended to multiple regression (more than an intercept
and one regressor).
Then the first two rows of output are extended to number of columns
equals number of regressors (including constant)
and the last three rows are the same as earlier.
EXCEL FUNCTION TREND
The individual function TREND can be used to get several forecasts
from a two-variable regression.
This is tricky to use:
-
Set up the X values for the forecast, say 6 in cell A21 and 7 in cell A22.
-
In cell B21 give the function TREND(A2:A6,B2:B6,A21:A22,1).
Note that labels are not included when using function TREND.
This gives only one value of 3.2 in cell B21. We need the other forecast
as well.
-
Highlight cells B21:B22 and hit F2 key (then Edit appears at the bottom
of the screen).
Hit CTRL-SHIFT-ENTER.
This now gives values of 3.2 in cell B21 and 3.6 in B22.
Thus for X=6 we forecast Y=3.2, and for X=7 we forecast Y=3.6, as expected
given Y = 0.8 + 0.4*X.
TREND can be extended to multiple regression (more than an intercept
and one regressor).
For further information on how to use Excel go to
http://www.econ.ucdavis.edu/faculty/cameron