* AED08.do March 2015 For Stata version 12 

log using AED08.txt, text replace

********** OVERVIEW OF AED08.do **********

* STATA Program 
* copyright C 2015 by A. Colin Cameron
* Used for "Analysis of Economics Data: An Introduction to Econometrics"
* by A. Colin Cameron (2015) W.W. Norton

* To run you need file
*   AED_HOUSE.DTA
* in your directory

********** SETUP **********

set more off
version 12
clear all
set scheme s1manual  // Graphics scheme

************

* This Stata program performs the analysis for Chapter 8
*  8.1 EXAMPLE: HOUSE PRICE AND SIZE
*  8.2 TWO-WAY TABULATION
*  8.3 TWOWAY SCATTER PLOT
*  8.4 CORRELATION
*  8.5 REGRESSION LINE
*  8.6 R-SQUARED
*  8.7 REGRESSION AND CORRELATION
*  8.8 COMPUTER OUTPUT FOLLOWING REGRESSION
*  8.10 CAUSATION
*  8.11 NONPARAMETRIC REGRESSION
*  8.A APPENDIX: REGRESSION COMPUTATION

********** DATA DESCRIPTION

* House sale price for 29 houses in Central Davis in 1999
*     29 observations on 9 variables 

****  8.1 EXAMPLE: HOUSE PRICE AND SIZE

clear
use AED_HOUSE.DTA
summarize

* Table 8.1
sort price
list, clean 

* Table 8.2
summarize price, detail
mean price
summarize size, detail
mean size

****  8.2 TWO-WAY TABULATION

* Create categorical variables
generate pricerange = price
recode pricerange (1/249999=1) (250000/400000=2)
generate sizerange = size
recode sizerange (1/1799=1) (1800/2399=2) (2400/4000=3)

* Table 8.3
tabulate pricerange sizerange, row

* Table 8.4 - with expected frequencies
tabulate pricerange sizerange, expected

****  8.3 TWOWAY SCATTER PLOT

* Figure 8.1 - First panel
graph twoway (scatter price size)
graph export AED08FIG1A.wmf, replace 

****  8.4 CORRELATION

* Figure 8.1 - Second panel
* Simple version
scatter price size, xline(1883) yline(253910) 
graph export AED08FIG1B.wmf, replace 
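
* Sketch added for illustration; not part of the original program.
* The reference lines above use hard-coded values. Assuming they are meant
* to mark the sample means, the same plot can compute them from the data.
* The local macro names sizemean and pricemean are illustrative.
quietly summarize size
local sizemean = r(mean)
quietly summarize price
local pricemean = r(mean)
scatter price size, xline(`sizemean') yline(`pricemean')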

* Covariance 
correlate price size, covariance
display %20.1f r(cov_12)

* Correlation coefficient
correlate price size
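
* Sketch added for illustration; not part of the original program.
* The correlation coefficient equals the covariance divided by the product
* of the two standard deviations. With the covariance option, correlate
* stores r(cov_12), r(Var_1) and r(Var_2).
quietly correlate price size, covariance
display "Correlation from covariance = " r(cov_12)/sqrt(r(Var_1)*r(Var_2))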

* Figure 8.2 - first panel as example
clear
set obs 30
set seed 12345
generate x = rnormal(3,1)
generate u = rnormal(0,0.8)
generate y = 3 + x + u 
correlate x y
scatter y x
graph export AED08FIG2A.wmf, replace 

****  8.5 REGRESSION LINE

* Return to house price data
clear
use AED_HOUSE.DTA

* Linear regression
regress price size

* Figure 8.3
* This was hand drawn

* Figure 8.4 
graph twoway (scatter price size) (lfit price size)
graph export AED08FIG4.wmf, replace 

* Intercept-only regression compared to the sample mean
regress price
mean price 
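
* Sketch added for illustration; not part of the original program.
* The OLS slope equals cov(price,size)/var(size), and the intercept equals
* mean(price) - slope*mean(size). The scalar names are illustrative.
* These values should match the coefficients from regress price size.
quietly correlate price size, covariance
scalar b_slope = r(cov_12)/r(Var_2)
quietly summarize price
scalar m_price = r(mean)
quietly summarize size
scalar m_size = r(mean)
display "Slope = " b_slope
display "Intercept = " m_price-b_slope*m_size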

****  8.6 R-SQUARED

* These data come from SAMPLEFIVE.DTA generated in chapter 9
* y was then regressed on x to get the fitted values yhat (and hence the residuals)
* Here the values are rounded to one decimal place
clear 
input x y yhat 
  1  4.7  3.9
  2  4.7  4.9
  3  4.5  6.0
  4  7.4  7.0
  5  8.7  8.2
end

list
summarize 
* Shows y and yhat have mean 6
gen tss = (y - 6)^2
quietly sum tss
scalar TSS = r(sum)
di "Total sum of squares = " TSS
gen expss = (yhat - 6)^2
quietly sum expss
scalar EXPSS = r(sum)
di "Explained sum of squares = " EXPSS
di "R-squared = " EXPSS/TSS
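
* Sketch added for illustration; not part of the original program.
* R-squared can also be computed as 1 - RSS/TSS, where RSS is the residual
* sum of squares. Because the data here are rounded, the two formulas
* agree only approximately.
gen rss = (y - yhat)^2
quietly sum rss
scalar RSS = r(sum)
di "Residual sum of squares = " RSS
di "R-squared (1 - RSS/TSS) = " 1-RSS/TSS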

* Figure 8.5 - two panels
scatter y x, yline(5.968) title("Total sum of squares")
graph export AED08FIG5A.wmf, replace 
scatter yhat x, yline(5.968) title("Explained sum of squares")
graph export AED08FIG5B.wmf, replace 

**** 8.8 COMPUTER OUTPUT FOLLOWING REGRESSION

* Table 8.5
clear
use AED_HOUSE.DTA
regress price size

* Intercept-only regression gives the same estimate as the sample mean
regress price
mean price

****  8.10 CAUSATION

clear
use AED_HOUSE.DTA
* Reverse regression
regress size price
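
* Sketch added for illustration; not part of the original program.
* The slope from the reverse regression is not the reciprocal of the
* original slope. The two slopes multiply to the squared correlation.
* The scalar names are illustrative.
quietly regress price size
scalar b_price_on_size = _b[size]
quietly regress size price
scalar b_size_on_price = _b[price]
quietly correlate price size
display "Product of slopes   = " b_price_on_size*b_size_on_price
display "Squared correlation = " r(rho)^2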

****  8.11 NONPARAMETRIC REGRESSION

clear
use AED_HOUSE.DTA
sort size
regress price size
predict ylinear

* Figure 8.6
* Shorten variable label for figure
label variable price "House price"
* lpoly (local linear, degree 1) with chosen bandwidth
lpoly price size, degree(1) bw(300) generate(xlpoly ylpoly) 
* lowess with default bandwidth
lowess price size, generate(ylowess) 
graph twoway (scatter price size) (line ylpoly xlpoly) (line ylowess size), title("Nonparametric regression")
graph export AED08FIG6.wmf, replace 
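
* Sketch added for illustration; not part of the original program.
* ylinear (the OLS fitted values from predict above) is not plotted in
* Figure 8.6. Overlaying it lets the linear fit be compared with the two
* nonparametric fits. The legend labels and title are illustrative.
graph twoway (scatter price size) (line ylinear size) ///
    (line ylpoly xlpoly) (line ylowess size), ///
    legend(order(1 "Data" 2 "Linear fit" 3 "lpoly" 4 "lowess")) ///
    title("Linear and nonparametric fits")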

********** CLOSE OUTPUT
log close


