. * creates output file
. * replace here means existing file of same name will be overwritten
. * and if this file is already open then give command log close
. di "stpanel.do by Colin Cameron: Stata panel regression example"
stpanel.do by Colin Cameron: Stata panel regression example
. set maxvar 100 width 1000
(maxvar and maxobs no longer need be set with this version of Stata)
. * If need more memory then in Stata give command help memory
.
.
. ********** OVERVIEW OF STATA PANEL COMMANDS
. *
. * Stata has many panel commands.
.
. * For linear models the main commands are
. *   XTREG    Classic fixed and random effects
. *   XTGLS    GLS of constant coeff model for panels correlated over i
. *            This is typical macro such as cross-country and region
. *   XTGEE    GLS of constant coeff model for panels independent over i
. *            This is typical short panel such as NLSY, PSID, ...
.
. * And other linear model commands include
. *   XTDATA   permits exploratory analysis with the XTREG models
. *   XTHAUS   permits test of fixed versus random effects
. *   XTREGAR  does fixed and random effects with AR(1) error
. *   XTIVREG  does IV estimation of fixed and random effects models
. *   XTABOND  does IV estimation of fixed effects model with lagged dependent regressors
. *   XTPCSE   is a variant of XTGLS with simpler model estimated with correct standard errors
. *   XTRCHH   random coefficients model is the only random effects model more complicated than random intercept
.
. * XTREG
. *   for model y_it = x_it'b + a_i + e_it where a_i is unobserved individual effect
. *   Options
. *   fe   Fixed effects or within or LSDV
. *   be   Between
. *   re   Random effects with GLS estimates of error component variances
. *   mle  Random effects with ML estimates of error component variances
. *   pa   Population-averaged - see xtgee
.
. * XTGEE
. *   Also for some nonlinear models - see below. Default is linear.
. *   Linear model y_it = x_it'b + u_it where u_it is independent over i and T is small
. *   Estimation is by WLS or GLS with different possible models for cor(u_it)
. *   Options
. *   robust  gives robust standard errors that permit general cor(u_it,u_is)
. *           Thus b = (X'RX)^-1 * X'Ry where R is working matrix defined by corr( )
. *           V[b] = (X'RX)^-1 without the robust option
. *           V[b] = (X'RX)^-1 * X'RVR'X * (X'RX)^-1 if the robust option is used
. *   corr(independent)    cor(u_it,u_is) = 0    t != s  (i.e. OLS)
. *   corr(exchangeable)   cor(u_it,u_is) = rho  t != s  (i.e. same as random effects - equicorrelation)
. *                        This is the default
. *   corr(ar g)           cor(u_it,u_is) = defined by an AR(g) model
. *   corr(stationary g)   cor(u_it,u_is) = defined by an MA(g) model
. *   corr(nonstationary)  cor(u_it,u_is) = rho_ts has an unrestricted structure
. *   user-specified R is also possible
.
. * XTGLS
. *   Linear model y_it = x_it'b + u_it where u_it may be correlated over i and n is small
. *   If n is large this command will need a lot of memory and take a long time
. *   and stacking y_i = x_i'b + u_i
. *   Estimation is by GLS and GLS standard errors are given
. *   Options
. *   panels defines the correlation over i
. *     panels(iid)         cov(u_i,u_j) = 0 and var(u_i) = s^2 * I
. *                         so u_it uncorrelated over i with same variance over i
. *                         This is the default
. *     panels(hetero)      cov(u_i,u_j) = 0 and var(u_i) = (s_i)^2 * I
. *                         so u_it uncorrelated over i with different variance over i
. *     panels(correlated)  cov(u_i,u_j) = s_ij * I
. *                         so u_it correlated over i with different variance and covariance over i
. *                         this is basic seemingly unrelated regressions
. *   corr defines the correlation over t
. *     corr(independent)   cor(u_it,u_is) = 0
. *                         so u_it uncorrelated over t
. *                         This is the default
. *     corr(ar1)           cor(u_it,u_is) = defined by AR(1) model with same rho for each i
. *     corr(psar1)         cor(u_it,u_is) = defined by AR(1) model with different rho for each i
.
. * XTPCSE
. *   Same model as XTGLS
. *   Instead of GLS estimation use OLS or AR1 error estimator that has no correlation over i
. *   but then get correct standard errors allowing for possible correlation over i
. *   (Stata calls this "panel-corrected standard errors" where correction is for correlation across i)
. *   corr defines the correlation over t as in XTGLS. This is imposed in estimation
. *     corr(independent)   cor(u_it,u_is) = 0   this is the default
. *     corr(ar1)           cor(u_it,u_is) = defined by AR(1) model with same rho for each i
. *     corr(psar1)         cor(u_it,u_is) = defined by AR(1) model with different rho for each i
. *   and then to define the correlation over i as
. *     blank        this is the default and has correlation over i and variance varying with i
. *     hetonly      no correlation over i and variance varying with i
. *     independent  no correlation over i and same variance over i
.
. * For nonlinear models the data are always independent over i, i.e. typical micro cross-section
. *   XTGEE    Does a range of models
. *   Other special commands exist for binary, tobit and count models
.
. * XTGEE for nonlinear models
. *   Same as PA population averaged
. *   Can be applied to
. *   Normal    various links or conditional mean functions
. *   Binomial  various links or conditional mean functions including logit, probit, cloglog
. *   Count     links or conditional mean functions include poisson and negative binomial
. *   Gamma
.
. * For the following special commands
. *   - sometimes a fixed effects version is available and sometimes it is not
. *   - sometimes the random effects integrate out and sometimes gaussian quadrature is used instead
.
. * Binary Models
. *   XTLOGIT    Logit FE, RE and PA
. *   XTPROBIT   Probit RE and PA (no FE)
. *   XTCLOG     Complementary log-log RE and PA (no FE)
.
. * Tobit Models
. *   XTTOBIT    Tobit RE
. *   XTINTREG   Interval regression RE
.
. * Count Models
. *   XTPOISSON  Poisson FE, RE and PA
. *   XTNBREG    Negative binomial FE, RE and PA
.
.
. ********** DATA DESCRIPTION
. *
. * The original data is from
. * Bronwyn Hall, Zvi Griliches, and Jerry Hausman (1986),
. * "Patents and R&D: Is There a Lag?",
. * International Economic Review, 27, 265-283.
.
. * File patr7079.dat has data on 346 firms
. * There are 4 lines per firm, with 25 variables
. *   Time-invariant: CUSIP,ARDSSIC,SCISECT,LOGK,SUMPAT,
. *   Time-varying X: LOGR70,LOGR71,LOGR72, ....., LOGR77,LOGR78,LOGR79
. *   Time-varying Y: PAT70,PAT71,PAT72, ....., PAT77,PAT78,PAT79
. * in the format:
. *   I7,I3,I2,5F12.6/6F12.6/6F12.6/5F12.6/
. * where
. *   CUSIP    Compustat's identifying number for the firm (Committee on
. *            Uniform Security Identification Procedures number).
. *   ARDSSIC  A two-digit code for the applied R&D industrial classification
. *            (roughly that in Bound, Cummins, Griliches, Hall, and Jaffe, in
. *            the Griliches R&D, Patents, and Productivity volume).
. *   SCISECT  Dummy equal to one for firms in the scientific sector.
. *   LOGK     The logarithm of the book value of capital in 1972.
. *   SUMPAT   The sum of patents applied for between 1972-1979.
. *   LOGR70-  The logarithm of R&D spending during the year (in 1972 dollars).
. *   LOGR79
. *   PAT70-   The number of patents applied for during the year that were
. *   PAT79    eventually granted.
.
.
. ********** READ DATA
. *
. * The data are in ascii file patr7079.dat
. * There are 346 observations on 25 variables with four lines per obs
. * The data are fixed format with
. *   line 1  variables 1-8    I7,I3,I2,5F12.6
. *   line 2  variables 9-14   6F12.6
. *   line 3  variables 15-20  6F12.6
. *   line 4  variables 21-25  5F12.6
.
. * Read in using Infile: FREE FORMAT WITHOUT DICTIONARY
. * As there is space between each observation, the data are also space-delimited
. * free format, so there is no need for a dictionary file
. * The following command spans more than one line so use /* and */
. infile CUSIP ARDSSIC SCISECT LOGK SUMPAT LOGR70 LOGR71 LOGR72 LOGR73 /*
> */ LOGR74 LOGR75 LOGR76 LOGR77 LOGR78 LOGR79 PAT70 PAT71 PAT72 /*
> */ PAT73 PAT74 PAT75 PAT76 PAT77 PAT78 PAT79 using patr7079.asc
(346 observations read)
. * To drop off extra blanks (if any) at end of file jaggia.asc
.
drop if _n>347 (0 observations deleted) . . . ********** DATA TRANSFORMATIONS AND CHECK . * Use observation number as an identifier, not just CUSIP . gen id = _n . label variable id "id" . * The following lists the variables in data set and summarizes data . describe Contains data obs: 346 vars: 26 size: 37,368 (96.3% of memory free) ------------------------------------------------------------------------------- 1. CUSIP float %9.0g 2. ARDSSIC float %9.0g 3. SCISECT float %9.0g 4. LOGK float %9.0g 5. SUMPAT float %9.0g 6. LOGR70 float %9.0g 7. LOGR71 float %9.0g 8. LOGR72 float %9.0g 9. LOGR73 float %9.0g 10. LOGR74 float %9.0g 11. LOGR75 float %9.0g 12. LOGR76 float %9.0g 13. LOGR77 float %9.0g 14. LOGR78 float %9.0g 15. LOGR79 float %9.0g 16. PAT70 float %9.0g 17. PAT71 float %9.0g 18. PAT72 float %9.0g 19. PAT73 float %9.0g 20. PAT74 float %9.0g 21. PAT75 float %9.0g 22. PAT76 float %9.0g 23. PAT77 float %9.0g 24. PAT78 float %9.0g 25. PAT79 float %9.0g 26. id float %9.0g id ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved . summarize Variable | Obs Mean Std. Dev. 
Min Max ---------+----------------------------------------------------- CUSIP | 346 531201.2 282074.9 800 989399 ARDSSIC | 336 9.97619 5.459706 1 21 SCISECT | 346 .4248555 .4950369 0 1 LOGK | 346 3.921063 2.095542 -1.76965 9.66626 SUMPAT | 346 284.7312 571.1136 0 3806 LOGR70 | 346 1.198348 1.941968 -3.67354 6.56641 LOGR71 | 346 1.169182 1.929444 -3.53055 6.95687 LOGR72 | 346 1.185953 1.929078 -3.35241 6.97009 LOGR73 | 346 1.231135 1.934896 -3.67395 7.06211 LOGR74 | 346 1.232636 1.946417 -3.15274 7.06524 LOGR75 | 346 1.165802 1.98001 -3.5476 6.76486 LOGR76 | 346 1.212888 1.979273 -3.84868 6.8285 LOGR77 | 346 1.250034 2.003002 -3.47884 6.90253 LOGR78 | 346 1.306511 2.019792 -3.2832 6.96345 LOGR79 | 346 1.345581 2.054982 -3.57742 7.03432 PAT70 | 346 40.00289 82.50335 0 608 PAT71 | 346 38.10983 78.40308 0 553 PAT72 | 346 36.30925 74.81591 0 557 PAT73 | 346 36.95376 77.91971 0 595 PAT74 | 346 37.60983 75.94388 0 528 PAT75 | 346 36.87283 75.98788 0 508 PAT76 | 346 35.84682 73.31613 0 487 PAT77 | 346 36.23121 72.75146 0 456 PAT78 | 346 32.80636 65.6505 0 434 PAT79 | 346 32.10116 66.36197 0 515 id | 346 173.5 100.0258 1 346 . . . ******** CHANGE ORGANIZATION OF DATA USING RESHAPE AND MORE TRANSFORMATIONS . * . reshape long PAT LOGR, i(id) j(year) (note: j = 70 71 72 73 74 75 76 77 78 79) Data wide -> long ----------------------------------------------------------------------------- Number of obs. 346 -> 3460 Number of variables 26 -> 9 j variable (10 values) -> year xij variables: PAT70 PAT71 ... PAT79 -> PAT LOGR70 LOGR71 ... LOGR79 -> LOGR ----------------------------------------------------------------------------- . * . describe Contains data obs: 3,460 vars: 9 size: 128,020 (84.3% of memory free) ------------------------------------------------------------------------------- 1. id float %9.0g id 2. year byte %9.0g 3. CUSIP float %9.0g 4. ARDSSIC float %9.0g 5. SCISECT float %9.0g 6. LOGK float %9.0g 7. SUMPAT float %9.0g 8. PAT float %9.0g 9. 
LOGR float %9.0g ------------------------------------------------------------------------------- Sorted by: id year Note: dataset has changed since last saved . summarize Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- id | 3460 173.5 99.89562 1 346 year | 3460 74.5 2.872696 70 79 CUSIP | 3460 531201.2 281707.7 800 989399 ARDSSIC | 3360 9.97619 5.452387 1 21 SCISECT | 3460 .4248555 .4943925 0 1 LOGK | 3460 3.921063 2.092814 -1.76965 9.66626 SUMPAT | 3460 284.7312 570.3701 0 3806 PAT | 3460 36.28439 74.46563 0 608 LOGR | 3460 1.229807 1.970524 -3.84868 7.06524 . . * Create new variable log(patents) with adjustment for patents = 0 . gen NEWPAT = PAT . replace NEWPAT = 0.5 if NEWPAT==0. (605 real changes made) . gen LPAT = ln(NEWPAT) . label variable LOGR "Ln(R&D)" . label variable LPAT "Ln(Patents)" . label variable PAT "Patents" . . * Create OLS residuals from regress LPAT on LOGR . regress LPAT LOGR Source | SS df MS Number of obs = 3460 ---------+------------------------------ F( 1, 3458) = 9163.58 Model | 9543.62221 1 9543.62221 Prob > F = 0.0000 Residual | 3601.41249 3458 1.04147267 R-squared = 0.7260 ---------+------------------------------ Adj R-squared = 0.7259 Total | 13145.0347 3459 3.80024131 Root MSE = 1.0205 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. t P>|t| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0088058 95.727 0.000 .8256806 .8602106 _cons | .8988033 .0204519 43.947 0.000 .8587043 .9389022 ------------------------------------------------------------------------------ . predict uols, residuals . . * Check data and Save data as Stata data set . describe Contains data obs: 3,460 vars: 12 size: 169,540 (79.4% of memory free) ------------------------------------------------------------------------------- 1. id float %9.0g id 2. year byte %9.0g 3. CUSIP float %9.0g 4. 
ARDSSIC float %9.0g 5. SCISECT float %9.0g 6. LOGK float %9.0g 7. SUMPAT float %9.0g 8. PAT float %9.0g Patents 9. LOGR float %9.0g Ln(R&D) 10. NEWPAT float %9.0g 11. LPAT float %9.0g Ln(Patents) 12. uols float %9.0g Residuals ------------------------------------------------------------------------------- Sorted by: id year Note: dataset has changed since last saved . summarize Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- id | 3460 173.5 99.89562 1 346 year | 3460 74.5 2.872696 70 79 CUSIP | 3460 531201.2 281707.7 800 989399 ARDSSIC | 3360 9.97619 5.452387 1 21 SCISECT | 3460 .4248555 .4943925 0 1 LOGK | 3460 3.921063 2.092814 -1.76965 9.66626 SUMPAT | 3460 284.7312 570.3701 0 3806 PAT | 3460 36.28439 74.46563 0 608 LOGR | 3460 1.229807 1.970524 -3.84868 7.06524 NEWPAT | 3460 36.37182 74.42325 .5 608 LPAT | 3460 1.935464 1.949421 -.6931472 6.410175 uols | 3460 -2.77e-10 1.020378 -3.400903 2.814375 . drop NEWPAT . save patr7079, replace file patr7079.dta saved . summarize Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- id | 3460 173.5 99.89562 1 346 year | 3460 74.5 2.872696 70 79 CUSIP | 3460 531201.2 281707.7 800 989399 ARDSSIC | 3360 9.97619 5.452387 1 21 SCISECT | 3460 .4248555 .4943925 0 1 LOGK | 3460 3.921063 2.092814 -1.76965 9.66626 SUMPAT | 3460 284.7312 570.3701 0 3806 PAT | 3460 36.28439 74.46563 0 608 LOGR | 3460 1.229807 1.970524 -3.84868 7.06524 LPAT | 3460 1.935464 1.949421 -.6931472 6.410175 uols | 3460 -2.77e-10 1.020378 -3.400903 2.814375 . xtsum, i(id) Variable | Mean Std. Dev. 
Min Max | Observations -----------------+--------------------------------------------+---------------- id overall | 173.5 99.89562 1 346 | N = 3460 between | 100.0258 1 346 | n = 346 within | 0 173.5 173.5 | T = 10 | | year overall | 74.5 2.872696 70 79 | N = 3460 between | 0 74.5 74.5 | n = 346 within | 2.872696 70 79 | T = 10 | | CUSIP overall | 531201.2 281707.7 800 989399 | N = 3460 between | 282074.9 800 989399 | n = 346 within | 0 531201.2 531201.2 | T = 10 | | ARDSSIC overall | 9.97619 5.452387 1 21 | N = 3360 between | 5.459706 1 21 | n = 336 within | 0 9.97619 9.97619 | T = 10 | | SCISECT overall | .4248555 .4943925 0 1 | N = 3460 between | .4950369 0 1 | n = 346 within | 0 .4248555 .4248555 | T = 10 | | LOGK overall | 3.921063 2.092814 -1.76965 9.66626 | N = 3460 between | 2.095542 -1.76965 9.66626 | n = 346 within | 0 3.921063 3.921063 | T = 10 | | SUMPAT overall | 284.7312 570.3701 0 3806 | N = 3460 between | 571.1136 0 3806 | n = 346 within | 0 284.7312 284.7312 | T = 10 | | PAT overall | 36.28439 74.46563 0 608 | N = 3460 between | 72.5989 0 484.8 | n = 346 within | 16.97772 -177.7156 224.3844 | T = 10 | | LOGR overall | 1.229807 1.970524 -3.84868 7.06524 | N = 3460 between | 1.944421 -3.120133 6.911438 | n = 346 within | .3347099 -1.19673 4.218814 | T = 10 | | LPAT overall | 1.935464 1.949421 -.6931472 6.410175 | N = 3460 between | 1.873181 -.6931472 6.180623 | n = 346 within | .5482375 -.2643028 4.368045 | T = 10 | | uols overall | -2.77e-10 1.020378 -3.400903 2.814375 | N = 3460 between | .8406646 -2.427556 2.090227 | n = 346 within | .5799081 -2.365188 2.365571 | T = 10 . . . ******** LOOK AT DATA AGAIN IN ORIGINAL FORM USING RESHAPE . . reshape wide PAT LOGR LPAT uols, i(id) j(year) (note: j = 70 71 72 73 74 75 76 77 78 79) Data long -> wide ----------------------------------------------------------------------------- Number of obs. 
3460 -> 346 Number of variables 11 -> 46 j variable (10 values) year -> (dropped) xij variables: PAT -> PAT70 PAT71 ... PAT79 LOGR -> LOGR70 LOGR71 ... LOGR79 LPAT -> LPAT70 LPAT71 ... LPAT79 uols -> uols70 uols71 ... uols79 ----------------------------------------------------------------------------- . . summarize Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- id | 346 173.5 100.0258 1 346 PAT70 | 346 40.00289 82.50335 0 608 LOGR70 | 346 1.198348 1.941968 -3.67354 6.56641 LPAT70 | 346 2.112845 1.904053 -.6931472 6.410175 uols70 | 346 .2038998 .9604677 -2.990364 2.814375 PAT71 | 346 38.10983 78.40308 0 553 LOGR71 | 346 1.169182 1.929444 -3.53055 6.95687 LPAT71 | 346 2.038442 1.928317 -.6931472 6.315358 uols71 | 346 .1540819 1.023832 -2.755215 2.426553 PAT72 | 346 36.30925 74.81591 0 557 LOGR72 | 346 1.185953 1.929078 -3.35241 6.97009 LPAT72 | 346 2.033894 1.885582 -.6931472 6.322565 uols72 | 346 .1353962 .9739384 -3.246577 2.62656 PAT73 | 346 36.95376 77.91971 0 595 LOGR73 | 346 1.231135 1.934896 -3.67395 7.06211 LPAT73 | 346 1.957524 1.945464 -.6931472 6.388561 uols73 | 346 .0209412 .9827114 -2.423676 2.708687 PAT74 | 346 37.60983 75.94388 0 528 LOGR74 | 346 1.232636 1.946417 -3.15274 7.06524 LPAT74 | 346 1.995871 1.950387 -.6931472 6.269096 uols74 | 346 .0580225 .9582284 -2.786935 2.44942 PAT75 | 346 36.87283 75.98788 0 508 LOGR75 | 346 1.165802 1.98001 -3.5476 6.76486 LPAT75 | 346 1.903179 1.980945 -.6931472 6.230482 uols75 | 346 .0216683 .9973142 -2.743793 2.263599 PAT76 | 346 35.84682 73.31613 0 487 LOGR76 | 346 1.212888 1.979273 -3.84868 6.8285 LPAT76 | 346 1.909246 1.944694 -.6931472 6.188264 uols76 | 346 -.0119562 .9925167 -3.201614 2.345425 PAT77 | 346 36.23121 72.75146 0 456 LOGR77 | 346 1.250034 2.003002 -3.47884 6.90253 LPAT77 | 346 1.858655 2.003905 -.6931472 6.122493 uols77 | 346 -.0938587 1.063898 -3.309486 2.417753 PAT78 | 346 32.80636 65.6505 0 434 LOGR78 | 346 1.306511 2.019792 -3.2832 6.96345 
LPAT78 | 346 1.822064 1.955754 -.6931472 6.073044 uols78 | 346 -.1780566 1.059863 -3.400903 2.512492 PAT79 | 346 32.10116 66.36197 0 515 LOGR79 | 346 1.345581 2.054982 -3.57742 7.03432 LPAT79 | 346 1.722916 1.986193 -.6931472 6.244167 uols79 | 346 -.3101384 1.08413 -3.245464 2.606471 CUSIP | 346 531201.2 282074.9 800 989399 ARDSSIC | 336 9.97619 5.459706 1 21 SCISECT | 346 .4248555 .4950369 0 1 LOGK | 346 3.921063 2.095542 -1.76965 9.66626 SUMPAT | 346 284.7312 571.1136 0 3806 . correlate LOGR70 LOGR71 LOGR72 LOGR73 LOGR74 LOGR75 LOGR76 LOGR77 LOGR78 LOGR > 79 (obs=346) | LOGR70 LOGR71 LOGR72 LOGR73 LOGR74 LOGR75 LOGR76 ---------+--------------------------------------------------------------- LOGR70 | 1.0000 LOGR71 | 0.9739 1.0000 LOGR72 | 0.9697 0.9858 1.0000 LOGR73 | 0.9625 0.9790 0.9885 1.0000 LOGR74 | 0.9576 0.9740 0.9793 0.9900 1.0000 LOGR75 | 0.9499 0.9669 0.9720 0.9845 0.9922 1.0000 LOGR76 | 0.9445 0.9601 0.9654 0.9781 0.9856 0.9928 1.0000 LOGR77 | 0.9400 0.9542 0.9601 0.9725 0.9803 0.9877 0.9925 LOGR78 | 0.9325 0.9475 0.9517 0.9659 0.9758 0.9838 0.9860 LOGR79 | 0.9180 0.9353 0.9381 0.9552 0.9641 0.9727 0.9770 | LOGR77 LOGR78 LOGR79 ---------+--------------------------- LOGR77 | 1.0000 LOGR78 | 0.9943 1.0000 LOGR79 | 0.9860 0.9931 1.0000 . 
correlate LPAT70 LPAT71 LPAT72 LPAT73 LPAT74 LPAT75 LPAT76 LPAT77 LPAT78 LPAT > 79 (obs=346) | LPAT70 LPAT71 LPAT72 LPAT73 LPAT74 LPAT75 LPAT76 ---------+--------------------------------------------------------------- LPAT70 | 1.0000 LPAT71 | 0.9336 1.0000 LPAT72 | 0.9194 0.9270 1.0000 LPAT73 | 0.9184 0.9324 0.9230 1.0000 LPAT74 | 0.9147 0.9120 0.9226 0.9279 1.0000 LPAT75 | 0.8967 0.9003 0.8954 0.9228 0.9352 1.0000 LPAT76 | 0.8992 0.8908 0.8925 0.9158 0.9321 0.9383 1.0000 LPAT77 | 0.8987 0.8918 0.8966 0.9032 0.9163 0.9341 0.9371 LPAT78 | 0.9063 0.9006 0.8961 0.9090 0.9258 0.9231 0.9259 LPAT79 | 0.8888 0.8867 0.8949 0.9068 0.9259 0.9221 0.9285 | LPAT77 LPAT78 LPAT79 ---------+--------------------------- LPAT77 | 1.0000 LPAT78 | 0.9440 1.0000 LPAT79 | 0.9423 0.9470 1.0000 . correlate uols70 uols71 uols72 uols73 uols74 uols75 uols76 uols77 uols78 uols > 79 (obs=346) | uols70 uols71 uols72 uols73 uols74 uols75 uols76 ---------+--------------------------------------------------------------- uols70 | 1.0000 uols71 | 0.7430 1.0000 uols72 | 0.6846 0.7131 1.0000 uols73 | 0.6624 0.7200 0.6779 1.0000 uols74 | 0.6737 0.6514 0.6725 0.6923 1.0000 uols75 | 0.6098 0.6157 0.5642 0.6749 0.7304 1.0000 uols76 | 0.5974 0.5557 0.5442 0.6251 0.6993 0.7370 1.0000 uols77 | 0.6118 0.5672 0.5859 0.6035 0.6491 0.7293 0.7732 uols78 | 0.6400 0.6134 0.5771 0.6243 0.6780 0.7034 0.7388 uols79 | 0.5546 0.5565 0.5573 0.5996 0.6574 0.6633 0.7307 | uols77 uols78 uols79 ---------+--------------------------- uols77 | 1.0000 uols78 | 0.7992 1.0000 uols79 | 0.7717 0.8005 1.0000 . 
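The correlate commands above spell out all ten year-specific variables by name. A shorter route, sketched here on the assumption that no other variable in memory starts with LOGR, LPAT or uols (which holds for the wide data summarized above), is to use varlist wildcards. This is an aside, not part of the original do-file.

```stata
* Sketch only: wildcards pick up the ten yearly columns of each series
correlate LOGR*
correlate LPAT*
correlate uols*
* The covariance option can be added in the same way
correlate uols*, covariance
```

Note that LPAT* does not pick up PAT70-PAT79, because those names do not begin with LPAT.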
corr LOGR70 LOGR71 LOGR72 LOGR73 LOGR74 LOGR75 LOGR76 LOGR77 LOGR78 LOGR79, c > ov (obs=346) | LOGR70 LOGR71 LOGR72 LOGR73 LOGR74 LOGR75 LOGR76 ---------+--------------------------------------------------------------- LOGR70 | 3.77124 LOGR71 | 3.64913 3.72275 LOGR72 | 3.63283 3.66927 3.72134 LOGR73 | 3.61657 3.65472 3.68967 3.74382 LOGR74 | 3.61953 3.65803 3.67708 3.72834 3.78854 LOGR75 | 3.65264 3.6937 3.71249 3.7716 3.82398 3.92044 LOGR76 | 3.63047 3.6665 3.68594 3.74568 3.79696 3.89091 3.91752 LOGR77 | 3.65636 3.68762 3.70965 3.76915 3.822 3.91731 3.93472 LOGR78 | 3.65767 3.69257 3.70831 3.77496 3.83617 3.93433 3.94176 LOGR79 | 3.66361 3.70848 3.7188 3.79819 3.85624 3.9579 3.97376 | LOGR77 LOGR78 LOGR79 ---------+--------------------------- LOGR77 | 4.01202 LOGR78 | 4.02249 4.07956 LOGR79 | 4.05856 4.12187 4.22295 . corr LPAT70 LPAT71 LPAT72 LPAT73 LPAT74 LPAT75 LPAT76 LPAT77 LPAT78 LPAT79, c > ov (obs=346) | LPAT70 LPAT71 LPAT72 LPAT73 LPAT74 LPAT75 LPAT76 ---------+--------------------------------------------------------------- LPAT70 | 3.62542 LPAT71 | 3.42766 3.71841 LPAT72 | 3.30077 3.37068 3.55542 LPAT73 | 3.40193 3.4977 3.38596 3.78483 LPAT74 | 3.39695 3.42993 3.39306 3.52084 3.80401 LPAT75 | 3.38226 3.43906 3.34448 3.55628 3.61338 3.92414 LPAT76 | 3.32971 3.34055 3.27286 3.46477 3.53553 3.61458 3.78184 LPAT77 | 3.42902 3.44597 3.38769 3.52126 3.58121 3.70784 3.65203 LPAT78 | 3.37508 3.39639 3.30461 3.45871 3.53151 3.57622 3.52144 LPAT79 | 3.3613 3.3959 3.35138 3.50404 3.5867 3.62798 3.58651 | LPAT77 LPAT78 LPAT79 ---------+--------------------------- LPAT77 | 4.01564 LPAT78 | 3.69953 3.82498 LPAT79 | 3.75052 3.67869 3.94496 . 
corr uols70 uols71 uols72 uols73 uols74 uols75 uols76 uols77 uols78 uols79, cov
(obs=346)

         |   uols70   uols71   uols72   uols73   uols74   uols75   uols76
---------+---------------------------------------------------------------
  uols70 |  .922498
  uols71 |  .730646  1.04823
  uols72 |  .640368  .711053  .948556
  uols73 |  .625208  .724381  .648858  .965722
  uols74 |  .620033  .639082  .627633  .651934  .918202
  uols75 |  .584139  .628663  .548069  .661464   .69804  .994636
  uols76 |   .56946  .564637  .526075  .609677  .665059  .729528  .985089
  uols77 |  .625143  .617873  .607078  .630927  .661711  .773868  .816422
  uols78 |   .65154  .665618  .595684  .650282  .688582  .743492  .777204
  uols79 |  .577476  .617691  .588469  .638759  .682971  .717139  .786204

         |   uols77   uols78   uols79
---------+---------------------------
  uols77 |  1.13188
  uols78 |  .901132  1.12331
  uols79 |  .890068  .919781  1.17534

.
.
. ********* XTDATA: LINEAR PANEL - SPECIFICATION SEARCH
.
. * XTDATA permits plots of between data, within data and overall data
. * Useful for looking at the data. See Stata manual under xtdata for example.
.
. * iis is an xt command that defines the variable for the ith individual
. * tis is an xt command that defines the variable for the tth year
.
. * Here only individual specific effects are considered. So do not use tis
.
. * For plotting we can use
. *   ksm                       kernel smoothing using lowess local regression line
. *   graph with c(s) option    median bands using smoothing spline
. * The latter is quicker so I use that
.
. * Overall plot of data
. use patr7079, clear
. graph LPAT LOGR, xlab ylab s(p) c(s) bands(20) saving (stpantot, replace) /*
> */ title("Overall regression: Ln(Patents) on LOG(R&D)")
. * OLS regression gives wrong standard errors as no attempt to control for clustering
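The comment above notes that plain OLS standard errors ignore the clustering of observations within firms. One common remedy, sketched here as an aside that this log does not run, is to request cluster-robust standard errors with id as the cluster variable. The vce(cluster id) syntax assumes a recent Stata release; older releases take cluster(id) instead.

```stata
* Sketch only: pooled OLS with standard errors clustered on firm
use patr7079, clear
regress LPAT LOGR, vce(cluster id)
```

The coefficient estimates are the same as in the unclustered regression; only the standard errors and the statistics built from them change.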
. regress LPAT LOGR

  Source |       SS       df       MS                  Number of obs =    3460
---------+------------------------------               F(  1,  3458) = 9163.58
   Model |  9543.62221     1  9543.62221               Prob > F      =  0.0000
Residual |  3601.41249  3458  1.04147267               R-squared     =  0.7260
---------+------------------------------               Adj R-squared =  0.7259
   Total |  13145.0347  3459  3.80024131               Root MSE      =  1.0205

------------------------------------------------------------------------------
    LPAT |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    LOGR |   .8429456   .0088058     95.727   0.000       .8256806    .8602106
   _cons |   .8988033   .0204519     43.947   0.000       .8587043    .9389022
------------------------------------------------------------------------------

. * ksm LPAT LOGR, lowess xlab ylab s(.) c(s) saving (stpantot, replace)
. * gphprint
.
. * Within plot of data
. use patr7079, clear
. iis id
. xtdata, fe
. graph LPAT LOGR, xlab ylab s(p) c(s) bands(20) saving (stpanre, replace) /*
> */ title("Within (fixed effects) regression: Ln(Patents) on LOG(R&D)")
. regress LPAT LOGR

  Source |       SS       df       MS                  Number of obs =    3360
---------+------------------------------               F(  1,  3358) =   60.07
   Model |  17.5209736     1  17.5209736               Prob > F      =  0.0000
Residual |  979.446416  3358  .291675526               R-squared     =  0.0176
---------+------------------------------               Adj R-squared =  0.0173
   Total |  996.967389  3359   .29680482               Root MSE      =  .54007

------------------------------------------------------------------------------
    LPAT |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    LOGR |   .2175953    .028075      7.750   0.000       .1625494    .2726412
   _cons |   1.668886     .03561     46.866   0.000       1.599067    1.738706
------------------------------------------------------------------------------

. * ksm LPAT LOGR, lowess xlab ylab s(.) c(s) saving (stpanre, replace)
. * gphprint
.
. * Between plot of data with lowess local regression line
. use patr7079, clear
. iis id
. xtdata, be
.
graph LPAT LOGR, xlab ylab s(p) c(s) bands(20) saving (stpanbe, replace) /* > */ title("Between: Ln(PATENTS) on LOG(R&D)") . regress LPAT LOGR Source | SS df MS Number of obs = 336 ---------+------------------------------ F( 1, 334) = 1327.83 Model | 944.140058 1 944.140058 Prob > F = 0.0000 Residual | 237.487538 334 .711040534 R-squared = 0.7990 ---------+------------------------------ Adj R-squared = 0.7984 Total | 1181.6276 335 3.52724655 Root MSE = .84323 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. t P>|t| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8585752 .0235617 36.439 0.000 .8122271 .9049233 _cons | .8841982 .0542972 16.284 0.000 .7773906 .9910058 ------------------------------------------------------------------------------ . * ksm LPAT LOGR, lowess xlab ylab s(.) c(s) saving (stpanbe, replace) . * gphprint . . * plot a graph again . * graph using myexampl\hrvslnw . * To print the graph: . * gphdot myexampl\hrvslnw for default medium resolution . * gphdot myexampl\hrvslnw /dhpl for low resolution . * gphdot myexampl\hrvslnw /dhplphr for high resolution . . . ********** XTREG: LINEAR PANEL - CLASSIC RANDOM AND FIXED EFFECTS . * . * Note that in the first xt command need to give , i(id) . * to indicate that the ith observation is for the ith id . * Time invariant regressors LOGK SCISECT are not included . use patr7079, clear . * . * Fixed effects . xtreg LPAT LOGR, fe i(id) Fixed-effects (within) regression Number of obs = 3460 Group variable (i) : id Number of groups = 346 R-sq: within = 0.0201 Obs per group: min = 10 between = 0.7989 avg = 10.0 overall = 0.7260 max = 10 F(1,3113) = 63.90 corr(u_i, Xb) = 0.8123 Prob > F = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. t P>|t| [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- LOGR | .2323019 .0290602 7.994 0.000 .1753228 .289281 _cons | 1.649777 .037038 44.543 0.000 1.577156 1.722399 ------------------------------------------------------------------------------ sigma_u | 1.4833342 sigma_e | .5720608 rho | .87052456 (fraction of variance due to u_i) ------------------------------------------------------------------------------ F test that all u_i=0: F(345,3113) = 22.88 Prob > F = 0.0000 . predict yfe, xbu . estimates list scalars: e(r2) = .0201143103387619 e(rmse) = .5720607953303093 e(mss) = 20.91188697893154 e(r2_a) = -.0887968520842348 e(ll) = -2794.253361353778 e(ll_0) = -2829.405849523752 e(rss) = 1018.740312213434 e(df_m) = 346 e(df_r) = 3113 e(tss) = 13145.03470090303 e(N) = 3460 e(df_b) = 1 e(r2_w) = .0201143103387619 e(df_a) = 345 e(F) = 63.90117617312389 e(F_f) = 22.8752483403756 e(Tbar) = 10 e(Tcon) = 1 e(g_min) = 10 e(g_avg) = 10 e(rho) = .8705245632899523 e(sigma) = 1.589821966433551 e(sigma_e) = .5720607953303093 e(r2_b) = .7989422692622233 e(r2_o) = .7260248778070558 e(corr) = .8122624817238637 e(sigma_u) = 1.483334194104854 e(ui) = 1.483334194104854 e(N_g) = 346 e(g_max) = 10 macros: e(cmd) : "xtreg" e(predict) : "xtrefe_p" e(model) : "fe" e(depvar) : "LPAT" e(ivar) : "id" matrices: e(b) : 1 x 2 e(V) : 2 x 2 functions: e(sample) . gen ufe = LPAT - yfe . . * Random effects . xtreg LPAT LOGR, re i(id) Random-effects GLS regression Number of obs = 3460 Group variable (i) : id Number of groups = 346 R-sq: within = 0.0201 Obs per group: min = 10 between = 0.7989 avg = 10.0 overall = 0.7260 max = 10 Random effects u_i ~ Gaussian Wald chi2(1) = 1058.80 corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- LOGR | .6151684 .0189054 32.539 0.000 .5781144 .6522223 _cons | 1.178925 .052473 22.467 0.000 1.07608 1.28177 ---------+-------------------------------------------------------------------- sigma_u | .82146096 sigma_e | .5720608 rho | .67341649 (fraction of variance due to u_i) ------------------------------------------------------------------------------ . estimates list scalars: e(sigma_u) = .821460963326105 e(sigma_e) = .5720607953303093 e(sigma) = 1.001025308282762 e(rho) = .6734164873303894 e(N) = 3460 e(Tbar) = 10 e(Tcon) = 1 e(N_g) = 346 e(df_m) = 1 e(chi2) = 1058.802111492624 e(g_min) = 10 e(g_avg) = 10 e(g_max) = 10 e(theta) = .784933998771121 e(r2_o) = .7260248778070603 e(r2_b) = .7989422692622236 e(r2_w) = .0201143103387595 macros: e(cmd) : "xtreg" e(predict) : "xtrere_p" e(model) : "re" e(ivar) : "id" e(chi2type) : "Wald" e(depvar) : "LPAT" matrices: e(b) : 1 x 2 e(V) : 2 x 2 e(VCEf) : 2 x 2 e(bf) : 1 x 2 functions: e(sample) . . * Hausman test of fixed versus random effects . xthaus Hausman specification test ---- Coefficients ---- | Fixed Random LPAT | Effects Effects Difference ---------+----------------------------------------- LOGR | .2323019 .6151684 -.3828665 Test: Ho: difference in coefficients not systematic chi2( 1) = (b-B)'[S^(-1)](b-B), S = (S_fe - S_re) = 300.95 Prob>chi2 = 0.0000 . . * Between . xtreg LPAT LOGR, be i(id) Between regression (regression on group means) Number of obs = 3460 Group variable (i) : id Number of groups = 346 R-sq: within = 0.0201 Obs per group: min = 10 between = 0.7989 avg = 10.0 overall = 0.7260 max = 10 F(1,344) = 1366.95 sd(u_i + avg(e_i.))= .8411441 Prob > F = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. t P>|t| [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- LOGR | .8610872 .0232901 36.972 0.000 .8152784 .9068961 _cons | .8764926 .053528 16.374 0.000 .7712092 .9817759 ------------------------------------------------------------------------------ . estimates list scalars: e(df_m) = 1 e(df_r) = 344 e(F) = 1366.951370721744 e(r2) = .7989422692622247 e(rmse) = .8411441431907178 e(mss) = 967.1501766203948 e(rss) = 243.3880735506721 e(r2_a) = .7983577991147311 e(ll) = -430.0945165389776 e(ll_0) = -707.6147491980769 e(N_g) = 346 e(N) = 3460 e(Tbar) = 10 e(Tcon) = 1 e(r2_b) = .7989422692622247 e(r2_o) = .7260248778070552 e(r2_w) = .0201143103387598 e(g_min) = 10 e(g_avg) = 10 e(g_max) = 10 macros: e(cmd) : "xtreg" e(predict) : "xtrefe_p" e(model) : "be" e(ivar) : "id" e(depvar) : "LPAT" matrices: e(b) : 1 x 2 e(V) : 2 x 2 functions: e(sample) . . * Random effects MLE will be slightly different from re . xtreg LPAT LOGR, mle i(id) Fitting constant-only model: Iteration 0: log likelihood = -4234.9549 Iteration 1: log likelihood = -3969.1508 Iteration 2: log likelihood = -3860.6634 Iteration 3: log likelihood = -3824.3714 Iteration 4: log likelihood = -3816.9936 Iteration 5: log likelihood = -3816.4774 Iteration 6: log likelihood = -3816.4736 Fitting full model: Iteration 0: log likelihood = -3714.146 Iteration 1: log likelihood = -3645.7607 Iteration 2: log likelihood = -3638.7445 Iteration 3: log likelihood = -3638.1592 Iteration 4: log likelihood = -3638.1528 Random-effects ML regression Number of obs = 3460 Group variable (i) : id Number of groups = 346 Random effects u_i ~ Gaussian Obs per group: min = 10 avg = 10.0 max = 10 LR chi2(1) = 356.64 Log likelihood = -3638.1528 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- LOGR | .5618991 .0259239 21.675 0.000 .5110891 .6127091 _cons | 1.244436 .063438 19.617 0.000 1.1201 1.368772 ---------+-------------------------------------------------------------------- /sigma_u | 1.003404 .0490914 20.440 0.000 .9071865 1.099621 /sigma_e | .583667 .007617 76.627 0.000 .568738 .598596 ---------+-------------------------------------------------------------------- rho | .7471832 .0198363 .7067892 .7844396 ------------------------------------------------------------------------------ Likelihood ratio test of sigma_u=0: chi2(1) = 2681.35 Prob > chi2 = 0.0000 . . * Population averaged is similar to re (estimates are close to the mle version of re) . * Exactly the same as xtgee, i(id) . xtreg LPAT LOGR, pa i(id) Iteration 1: tolerance = .13451485 Iteration 2: tolerance = .02869306 Iteration 3: tolerance = .01044356 Iteration 4: tolerance = .00391092 Iteration 5: tolerance = .00147322 Iteration 6: tolerance = .00055587 Iteration 7: tolerance = .00020985 Iteration 8: tolerance = .00007924 Iteration 9: tolerance = .00002992 Iteration 10: tolerance = .0000113 Iteration 11: tolerance = 4.267e-06 Iteration 12: tolerance = 1.611e-06 Iteration 13: tolerance = 6.086e-07 GEE population-averaged model Number of obs = 3460 Group variable: id Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: exchangeable max = 10 Wald chi2(1) = 754.80 Scale parameter: 1.347485 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .5618997 .0204523 27.474 0.000 .5218139 .6019855 _cons | 1.244435 .0603405 20.624 0.000 1.12617 1.362701 ------------------------------------------------------------------------------ . . * Check the fixed effects residuals . keep ufe id year . 
reshape wide ufe, i(id) j(year) (note: j = 70 71 72 73 74 75 76 77 78 79) Data long -> wide ----------------------------------------------------------------------------- Number of obs. 3460 -> 346 Number of variables 3 -> 11 j variable (10 values) year -> (dropped) xij variables: ufe -> ufe70 ufe71 ... ufe79 ----------------------------------------------------------------------------- . correlate ufe70 ufe71 ufe72 ufe73 ufe74 ufe75 ufe76 ufe77 ufe78 ufe79 (obs=346) | ufe70 ufe71 ufe72 ufe73 ufe74 ufe75 ufe76 ---------+--------------------------------------------------------------- ufe70 | 1.0000 ufe71 | 0.2412 1.0000 ufe72 | 0.1067 0.1823 1.0000 ufe73 | -0.0148 0.1622 0.0499 1.0000 ufe74 | -0.1073 -0.1570 -0.0084 -0.0740 1.0000 ufe75 | -0.2784 -0.2349 -0.3021 -0.0797 0.0155 1.0000 ufe76 | -0.2425 -0.3579 -0.3202 -0.1806 -0.0319 0.1061 1.0000 ufe77 | -0.2600 -0.3563 -0.2729 -0.3449 -0.2728 0.0507 0.1194 ufe78 | -0.1693 -0.2464 -0.2972 -0.2892 -0.1647 -0.1204 -0.0509 ufe79 | -0.3446 -0.3570 -0.2531 -0.2578 -0.1044 -0.0978 0.0332 | ufe77 ufe78 ufe79 ---------+--------------------------- ufe77 | 1.0000 ufe78 | 0.1799 1.0000 ufe79 | 0.1898 0.2290 1.0000 . corr ufe70 ufe71 ufe72 ufe73 ufe74 ufe75 ufe76 ufe77 ufe78 ufe79, cov (obs=346) | ufe70 ufe71 ufe72 ufe73 ufe74 ufe75 ufe76 ---------+--------------------------------------------------------------- ufe70 | .29159 ufe71 | .073999 .322917 ufe72 | .032812 .058986 .324179 ufe73 | -.004184 .048329 .014914 .275095 ufe74 | -.027953 -.043055 -.002318 -.018733 .232924 ufe75 | -.078395 -.069585 -.089681 -.02179 .00391 .27187 ufe76 | -.067565 -.104932 -.094054 -.048877 -.007937 .028545 .266166 ufe77 | -.074495 -.107436 -.082452 -.095969 -.069866 .014035 .032669 ufe78 | -.045937 -.070353 -.085046 -.076228 -.039933 -.031551 -.0132 ufe79 | -.099872 -.108872 -.07734 -.072557 -.027041 -.027359 .009184 | ufe77 ufe78 ufe79 ---------+--------------------------- ufe77 | .281495 ufe78 | .04797 .252517 ufe79 | .054049 .06176 .288047 . . . 
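The rho, theta, Hausman and likelihood-ratio figures printed in the XTREG output above can be checked by hand. What follows is a minimal sketch that only redoes the arithmetic with numbers copied from this log. The scalar names (su_fe, su_re, se) are made up for illustration and are not part of stpanel.do, and the theta formula is the usual one for a balanced random effects panel, assumed here rather than taken from the log.

```stata
* Numbers are copied from the xtreg output in this log.
* Scalar names are illustrative only.
scalar su_fe = 1.4833342
scalar su_re = .82146096
scalar se    = .5720608

* fe: rho = sigma_u^2/(sigma_u^2 + sigma_e^2); the log reports .87052456
display su_fe^2/(su_fe^2 + se^2)

* re: same formula with the re estimate of sigma_u; the log reports .67341649
display su_re^2/(su_re^2 + se^2)

* re: theta = 1 - sqrt(sigma_e^2/(T*sigma_u^2 + sigma_e^2)) with T = 10
* (assumed balanced-panel formula); the log reports e(theta) = .784934
display 1 - sqrt(se^2/(10*su_re^2 + se^2))

* Hausman with one coefficient: (b_fe - b_re)^2/(V_fe - V_re)
* the log reports chi2(1) = 300.95
display (.2323019 - .6151684)^2/(.0290602^2 - .0189054^2)

* mle: LR chi2(1) = 2*(lnL full - lnL constant-only); the log reports 356.64
display 2*(-3638.1528 - (-3816.4736))
```

Because the snippet uses only scalar and display, it needs no data in memory and can be pasted into a do-file on its own.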
****** XTGEE: LINEAR PANEL - CONSTANT COEFFICIENTS INDEPENDENCE OVER I . . use patr7079, clear . . * First OLS with no attempt to control for correlation over t for given i . * xtgee gives the same coefficients and the same wrong standard errors, but divides by n not n-k . regress LPAT LOGR Source | SS df MS Number of obs = 3460 ---------+------------------------------ F( 1, 3458) = 9163.58 Model | 9543.62221 1 9543.62221 Prob > F = 0.0000 Residual | 3601.41249 3458 1.04147267 R-squared = 0.7260 ---------+------------------------------ Adj R-squared = 0.7259 Total | 13145.0347 3459 3.80024131 Root MSE = 1.0205 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. t P>|t| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0088058 95.727 0.000 .8256806 .8602106 _cons | .8988033 .0204519 43.947 0.000 .8587043 .9389022 ------------------------------------------------------------------------------ . xtgee LPAT LOGR, corr(independent) i(id) Iteration 1: tolerance = 3.508e-15 GEE population-averaged model Number of obs = 3460 Group variable: id Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: independent max = 10 Wald chi2(1) = 9168.88 Scale parameter: 1.040871 Prob > chi2 = 0.0000 Pearson chi2(3460): 3601.41 Deviance = 3601.41 Dispersion (Pearson): 1.040871 Dispersion = 1.040871 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0088032 95.754 0.000 .8256916 .8601996 _cons | .8988033 .020446 43.960 0.000 .8587299 .9388766 ------------------------------------------------------------------------------ . . * Second OLS with standard errors that control for correlation over t for given i . * These robust standard errors permit general cor(u_it,u_is), not just equicorrelation . 
regress LPAT LOGR, cluster(id) Regression with robust standard errors Number of obs = 3460 F( 1, 345) = 1610.67 Prob > F = 0.0000 R-squared = 0.7260 Number of clusters (id) = 346 Root MSE = 1.0205 ------------------------------------------------------------------------------ | Robust LPAT | Coef. Std. Err. t P>|t| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0210037 40.133 0.000 .8016341 .8842571 _cons | .8988033 .0522741 17.194 0.000 .7959871 1.001619 ------------------------------------------------------------------------------ . xtgee LPAT LOGR, corr(independent) i(id) robust Iteration 1: tolerance = 3.508e-15 GEE population-averaged model Number of obs = 3460 Group variable: id Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: independent max = 10 Wald chi2(1) = 1611.14 Scale parameter: 1.040871 Prob > chi2 = 0.0000 Pearson chi2(3460): 3601.41 Deviance = 3601.41 Dispersion (Pearson): 1.040871 Dispersion = 1.040871 (standard errors adjusted for clustering on id) ------------------------------------------------------------------------------ | Semi-robust LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0210007 40.139 0.000 .801785 .8841062 _cons | .8988033 .0522666 17.197 0.000 .7963626 1.001244 ------------------------------------------------------------------------------ . . * Third GLS with attempts to control for correlation over t for given i . * These are equicorrelation / exchangeable, the same as random intercept . 
xtgee LPAT LOGR, i(id) Iteration 1: tolerance = .13451485 Iteration 2: tolerance = .02869306 Iteration 3: tolerance = .01044356 Iteration 4: tolerance = .00391092 Iteration 5: tolerance = .00147322 Iteration 6: tolerance = .00055587 Iteration 7: tolerance = .00020985 Iteration 8: tolerance = .00007924 Iteration 9: tolerance = .00002992 Iteration 10: tolerance = .0000113 Iteration 11: tolerance = 4.267e-06 Iteration 12: tolerance = 1.611e-06 Iteration 13: tolerance = 6.086e-07 GEE population-averaged model Number of obs = 3460 Group variable: id Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: exchangeable max = 10 Wald chi2(1) = 754.80 Scale parameter: 1.347485 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .5618997 .0204523 27.474 0.000 .5218139 .6019855 _cons | 1.244435 .0603405 20.624 0.000 1.12617 1.362701 ------------------------------------------------------------------------------ . 
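The comments earlier in this section say that xtgee with corr(independent) reproduces OLS but divides by n rather than n-k, and that clustering on id changes the standard errors a good deal. The sketch below redoes that arithmetic with figures copied from the regress and xtgee output in this log. The scalar names rss, n and k are illustrative, and k = 2 is assumed to count the slope and the intercept.

```stata
* Figures copied from the regress / xtgee, corr(independent) output in this log.
* Scalar names are illustrative only.
scalar rss = 3601.41249
scalar n   = 3460
scalar k   = 2

* regress divides by n-k: the log reports MS residual = 1.04147267
display rss/(n - k)

* xtgee divides by n: the log reports Scale parameter = 1.040871
display rss/n

* Rescaling the OLS standard error of LOGR by sqrt((n-k)/n)
* should land close to the xtgee value .0088032
display .0088058*sqrt((n - k)/n)

* Ratio of the cluster(id) standard error to the default OLS standard error
display .0210037/.0088058
```

The last ratio gives a rough sense of how much the default OLS standard error understates the cluster-robust one for these data.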
xtgee LPAT LOGR, i(id) robust Iteration 1: tolerance = .13451485 Iteration 2: tolerance = .02869306 Iteration 3: tolerance = .01044356 Iteration 4: tolerance = .00391092 Iteration 5: tolerance = .00147322 Iteration 6: tolerance = .00055587 Iteration 7: tolerance = .00020985 Iteration 8: tolerance = .00007924 Iteration 9: tolerance = .00002992 Iteration 10: tolerance = .0000113 Iteration 11: tolerance = 4.267e-06 Iteration 12: tolerance = 1.611e-06 Iteration 13: tolerance = 6.086e-07 GEE population-averaged model Number of obs = 3460 Group variable: id Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: exchangeable max = 10 Wald chi2(1) = 417.80 Scale parameter: 1.347485 Prob > chi2 = 0.0000 (standard errors adjusted for clustering on id) ------------------------------------------------------------------------------ | Semi-robust LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .5618997 .0274901 20.440 0.000 .50802 .6157793 _cons | 1.244435 .0538751 23.099 0.000 1.138842 1.350029 ------------------------------------------------------------------------------ . . * Fourth GLS with attempts to control for correlation over t for given i . * These use AR(1) errors . xtgee LPAT LOGR, corr(ar 1) i(id) t(year) Iteration 1: tolerance = .04390653 Iteration 2: tolerance = .00330958 Iteration 3: tolerance = .00037139 Iteration 4: tolerance = .00004272 Iteration 5: tolerance = 4.926e-06 Iteration 6: tolerance = 5.683e-07 GEE population-averaged model Number of obs = 3460 Group and time vars: id year Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: AR(1) max = 10 Wald chi2(1) = 1889.00 Scale parameter: 1.070925 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- LOGR | .7554645 .0173819 43.463 0.000 .7213965 .7895325 _cons | .9877598 .0427566 23.102 0.000 .9039584 1.071561 ------------------------------------------------------------------------------ . xtgee LPAT LOGR, corr(ar 1) i(id) t(year) robust Iteration 1: tolerance = .04390653 Iteration 2: tolerance = .00330958 Iteration 3: tolerance = .00037139 Iteration 4: tolerance = .00004272 Iteration 5: tolerance = 4.926e-06 Iteration 6: tolerance = 5.683e-07 GEE population-averaged model Number of obs = 3460 Group and time vars: id year Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: AR(1) max = 10 Wald chi2(1) = 1271.85 Scale parameter: 1.070925 Prob > chi2 = 0.0000 (standard errors adjusted for clustering on id) ------------------------------------------------------------------------------ | Semi-robust LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .7554645 .0211835 35.663 0.000 .7139457 .7969833 _cons | .9877598 .0505761 19.530 0.000 .8886325 1.086887 ------------------------------------------------------------------------------ . . * Fifth GLS with attempts to control for correlation over t for given i . * These are unstructured correlations similar to MA(T-1) . 
xtgee LPAT LOGR, corr(unstructured) i(id) t(year) Iteration 1: tolerance = .13702967 Iteration 2: tolerance = .02343495 Iteration 3: tolerance = .00842342 Iteration 4: tolerance = .00331556 Iteration 5: tolerance = .00128104 Iteration 6: tolerance = .00048966 Iteration 7: tolerance = .00018608 Iteration 8: tolerance = .0000705 Iteration 9: tolerance = .00002667 Iteration 10: tolerance = .00001008 Iteration 11: tolerance = 3.810e-06 Iteration 12: tolerance = 1.439e-06 Iteration 13: tolerance = 5.437e-07 GEE population-averaged model Number of obs = 3460 Group and time vars: id year Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: unstructured max = 10 Wald chi2(1) = 782.28 Scale parameter: 1.292814 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .5885625 .0210432 27.969 0.000 .5473186 .6298065 _cons | 1.238972 .0585841 21.149 0.000 1.124149 1.353795 ------------------------------------------------------------------------------ . 
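With a single regressor, the Wald chi2(1) that xtgee prints is the squared z statistic of LOGR, so it can be recomputed from the coefficient and its standard error. The sketch below does this for the exchangeable and unstructured fits using numbers copied from this log, and also counts the distinct correlations an unstructured working matrix has when T = 10. Small differences from the printed values are expected because the log rounds the inputs.

```stata
* Coefficients and standard errors copied from the xtgee output in this log.

* corr(exchangeable), default standard errors: the log reports Wald chi2(1) = 754.80
display (.5618997/.0204523)^2

* corr(exchangeable), robust standard errors: the log reports Wald chi2(1) = 417.80
display (.5618997/.0274901)^2

* corr(unstructured), default standard errors: the log reports Wald chi2(1) = 782.28
display (.5885625/.0210432)^2

* Number of distinct off-diagonal correlations in an unstructured
* T x T working correlation matrix with T = 10
display 10*(10 - 1)/2
```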
xtgee LPAT LOGR, corr(unstructured) i(id) t(year) robust Iteration 1: tolerance = .13702967 Iteration 2: tolerance = .02343495 Iteration 3: tolerance = .00842342 Iteration 4: tolerance = .00331556 Iteration 5: tolerance = .00128104 Iteration 6: tolerance = .00048966 Iteration 7: tolerance = .00018608 Iteration 8: tolerance = .0000705 Iteration 9: tolerance = .00002667 Iteration 10: tolerance = .00001008 Iteration 11: tolerance = 3.810e-06 Iteration 12: tolerance = 1.439e-06 Iteration 13: tolerance = 5.437e-07 GEE population-averaged model Number of obs = 3460 Group and time vars: id year Number of groups = 346 Link: identity Obs per group: min = 10 Family: Gaussian avg = 10.0 Correlation: unstructured max = 10 Wald chi2(1) = 551.44 Scale parameter: 1.292814 Prob > chi2 = 0.0000 (standard errors adjusted for clustering on id) ------------------------------------------------------------------------------ | Semi-robust LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .5885625 .0250637 23.483 0.000 .5394386 .6376865 _cons | 1.238972 .0518335 23.903 0.000 1.13738 1.340564 ------------------------------------------------------------------------------ . . . ****** XTGLS: LINEAR PANEL - CONSTANT COEFFICIENTS CORRELATION OVER I . . * These are not suitable for the data here but do for illustration . * These require MATSIZE of at least 346 . * Intercooled Stata default is 40 . set matsize 400 . * and also need more memory . clear . set memory 40m (40960k) . use patr7079, clear . . * This gives same as OLS and same wrong standard errors as OLS . xtgls LPAT LOGR, i(id) t(year) Cross-sectional time-series FGLS regression Coefficients: generalized least squares Panels: homoscedastic Correlation: no autocorrelation Estimated covariances = 1 Number of obs = 3460 Estimated autocorrelations = 0 Number of groups = 346 Estimated coefficients = 2 No. 
of time periods= 10 Wald chi2(1) = 9168.88 Log likelihood = -4978.827 Pr > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0088032 95.754 0.000 .8256916 .8601996 _cons | .8988033 .020446 43.960 0.000 .8587299 .9388766 ------------------------------------------------------------------------------ . . * The next two are not suitable for the data here as they assume independence over t . * They still understate the true standard errors as they ignore clustering on i . * panels(hetero) does GLS with different variance over i . * panels(correlated) does GLS with different variance over i and correlation over i . xtgls LPAT LOGR, i(id) t(year) panels(hetero) Cross-sectional time-series FGLS regression Coefficients: generalized least squares Panels: heteroscedastic Correlation: no autocorrelation Estimated covariances = 346 Number of obs = 3460 Estimated autocorrelations = 0 Number of groups = 346 Estimated coefficients = 2 No. of time periods= 10 Wald chi2(1) = 27973.35 Log likelihood = -4200.892 Pr > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8513676 .0050903 167.252 0.000 .8413907 .8613444 _cons | .9539549 .0154898 61.586 0.000 .9235955 .9843144 ------------------------------------------------------------------------------ . xtgls LPAT LOGR, i(id) t(year) panels(correlated) Cross-sectional time-series FGLS regression Coefficients: generalized least squares Panels: heteroscedastic with cross-sectional correlation Correlation: no autocorrelation Estimated covariances = 60031 Number of obs = 3460 Estimated autocorrelations = 0 Number of groups = 346 Estimated coefficients = 2 No. 
of time periods= 10 Wald chi2(1) = 55.54 Log likelihood = . Pr > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .676804 .0908132 7.453 0.000 .4988133 .8547947 _cons | .7904485 .0976066 8.098 0.000 .5991431 .9817538 ------------------------------------------------------------------------------ Note: You estimated at least as many quantities as you have observations. . . * The next two are possible alternatives to random effects / equicorrelation . * They permit AR(1) correlation over t . * corr(ar1) has same rho for different i . * corr(psar1) has different rho for different i . xtgls LPAT LOGR, i(id) t(year) corr(ar1) Cross-sectional time-series FGLS regression Coefficients: generalized least squares Panels: homoscedastic Correlation: common AR(1) coefficient for all panels (0.6489) Estimated covariances = 1 Number of obs = 3460 Estimated autocorrelations = 1 Number of groups = 346 Estimated coefficients = 2 No. of time periods= 10 Wald chi2(1) = 3186.80 Log likelihood = -3660.193 Pr > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .7939342 .0140639 56.452 0.000 .7663694 .821499 _cons | .9452998 .0337064 28.045 0.000 .8792364 1.011363 ------------------------------------------------------------------------------ . xtgls LPAT LOGR, i(id) t(year) corr(psar1) Cross-sectional time-series FGLS regression Coefficients: generalized least squares Panels: homoscedastic Correlation: panel-specific AR(1) Estimated covariances = 1 Number of obs = 3460 Estimated autocorrelations = 346 Number of groups = 346 Estimated coefficients = 2 No. 
of time periods= 10 Wald chi2(1) = 3590.78 Log likelihood = -3252.024 Pr > chi2 = 0.0000 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .7525244 .0125582 59.923 0.000 .7279109 .7771379 _cons | .9141163 .0210558 43.414 0.000 .8728476 .9553849 ------------------------------------------------------------------------------ . . . * To use xtpcse need to use tsset . * These take forever and are commented out . * Here I use independent option so no correlation over i and same variance . * This should be comparable to XTGEE independent . tsset id year, yearly panel variable: id, 1 to 346 time variable: year, . to . . xtpcse LPAT LOGR, corr(independent) independent Linear regression, independent panels corrected standard errors Group variable: id Number of obs = 3460 Time variable: year Number of groups = 346 Panels: independent (balanced) Obs per group: min = 10 Autocorrelation: no autocorrelation avg = 10 max = 10 Estimated covariances = 1 R-squared = 0.7260 Estimated autocorrelations = 0 Wald chi2(1) = 9168.88 Estimated coefficients = 2 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ | Indep-corrected | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .8429456 .0088032 95.754 0.000 .8256916 .8601996 _cons | .8988033 .020446 43.960 0.000 .8587299 .9388766 ------------------------------------------------------------------------------ . 
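The xtgls runs above report how many covariances each panels() option estimates, and the panels(correlated) run ends with a note that at least as many quantities were estimated as there are observations. A small sketch of the counting, assuming one variance per panel plus one covariance per pair of panels; the scalar names N_g and T are illustrative.

```stata
* Panel dimensions copied from the log. Scalar names are illustrative only.
scalar N_g = 346
scalar T   = 10

* panels(correlated): N_g variances plus N_g*(N_g-1)/2 covariances
* the log reports Estimated covariances = 60031
display N_g*(N_g + 1)/2

* Total observations available: the log reports Number of obs = 3460
display N_g*T

* Estimated quantities per observation under panels(correlated)
display (N_g*(N_g + 1)/2)/(N_g*T)
```

By the same counting, panels(hetero) estimates 346 variances and corr(psar1) estimates 346 autocorrelations, which matches the counts printed in the log.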
xtpcse LPAT LOGR, corr(ar1) independent (note: estimates of rho outside {-1,1} bounded to be in the range {-1,1}) Prais-Winsten regression, independent panels corrected standard errors Group variable: id Number of obs = 3460 Time variable: year Number of groups = 346 Panels: independent (balanced) Obs per group: min = 10 Autocorrelation: common AR(1) avg = 10 max = 10 Estimated covariances = 1 R-squared = 0.5244 Estimated autocorrelations = 1 Wald chi2(1) = 3244.67 Estimated coefficients = 2 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ | Indep-corrected | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .7950857 .0139582 56.962 0.000 .7677281 .8224432 _cons | .9440678 .0334278 28.242 0.000 .8785505 1.009585 ------------------------------------------------------------------------------ rho | .6444006 ------------------------------------------------------------------------------ . xtpcse LPAT LOGR, corr(psar1) independent (note: estimates of rho outside {-1,1} bounded to be in the range {-1,1}) Prais-Winsten regression, independent panels corrected standard errors Group variable: id Number of obs = 3460 Time variable: year Number of groups = 346 Panels: independent (balanced) Obs per group: min = 10 Autocorrelation: panel-specific AR(1) avg = 10 max = 10 Estimated covariances = 1 R-squared = 0.5578 Estimated autocorrelations = 346 Wald chi2(1) = 3595.28 Estimated coefficients = 2 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ | Indep-corrected | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- LOGR | .7511973 .0125282 59.961 0.000 .7266425 .7757521 _cons | .9133408 .0209716 43.551 0.000 .8722373 .9544443 ------------------------------------------------------------------------------ rhos = 1 .299009 .4619561 .9889363 -.4727211 ... .9653904 ------------------------------------------------------------------------------ . . . ****** XTRCHH: LINEAR PANEL - RANDOM COEFFICIENTS MODEL . . * This is the only random effects model more complicated than the random intercept model . * Stata does not have an equivalent of SAS proc mixed . xtrchh LPAT LOGR, i(id) t(year) Hildreth-Houck random-coefficients regression Number of obs = 3460 Group variable (i) : id Number of groups = 346 Obs per group: min = 10 avg = 10.0 max = 10 Wald chi2(1) = 7.57 Prob > chi2 = 0.0059 ------------------------------------------------------------------------------ LPAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .2275745 .082714 2.751 0.006 .0654581 .389691 _cons | 1.657762 .1899571 8.727 0.000 1.285453 2.030071 ------------------------------------------------------------------------------ Test of parameter constancy: chi2(690) = 28403.12 Prob > chi2 = 0.0000 . . . ********** NONLINEAR PANEL REGRESSION . . * Note that the first xt command needs to be given , i(id) . * to indicate that id identifies the individual i . * Time invariant regressors LOGK SCISECT are not included . . use patr7079, clear . . . ****** XTPOIS: POISSON RANDOM AND FIXED EFFECTS . * . * Poisson Cross-section with Poisson standard errors . poisson PAT LOGR Iteration 0: log likelihood = -40068.501 Iteration 1: log likelihood = -40068.498 Poisson regression Number of obs = 3460 LR chi2(1) = 229997.33 Prob > chi2 = 0.0000 Log likelihood = -40068.498 Pseudo R2 = 0.7416 ------------------------------------------------------------------------------ PAT | Coef. 
Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .7023903 .0015537 452.081 0.000 .6993452 .7054355 _cons | 1.756943 .0067067 261.967 0.000 1.743798 1.770087 ------------------------------------------------------------------------------ . . * Poisson Cross-section with robust standard errors . poisson PAT LOGR, robust Iteration 0: log likelihood = -40068.501 Iteration 1: log likelihood = -40068.498 Poisson regression Number of obs = 3460 Wald chi2(1) = 2614.82 Prob > chi2 = 0.0000 Log likelihood = -40068.498 Pseudo R2 = 0.7416 ------------------------------------------------------------------------------ | Robust PAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .7023903 .0137359 51.135 0.000 .6754684 .7293123 _cons | 1.756943 .0418732 41.959 0.000 1.674873 1.839012 ------------------------------------------------------------------------------ . . * Poisson fixed effects . xtpois PAT LOGR, fe i(id) Note: 8 groups (80 obs) dropped due to all zero outcomes. Iteration 0: log likelihood = -10331.876 Iteration 1: log likelihood = -10181.446 Iteration 2: log likelihood = -10181.44 Iteration 3: log likelihood = -10181.44 Conditional fixed-effects Poisson Number of obs = 3380 Group variable (i) : id Number of groups = 338 Obs per group: min = 10 avg = 10.0 max = 10 Wald chi2(1) = 302.12 Log likelihood = -10181.44 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ PAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .2414198 .0138895 17.381 0.000 .2141969 .2686427 ------------------------------------------------------------------------------ . . * Poisson random effects . 
xtpois PAT LOGR, re i(id) Fitting comparison Poisson model: Iteration 0: log likelihood = -40068.501 Iteration 1: log likelihood = -40068.498 Fitting full model: Iteration 0: log likelihood = -12714.766 Iteration 1: log likelihood = -12322.29 Iteration 2: log likelihood = -12305.004 Iteration 3: log likelihood = -12304.637 Iteration 4: log likelihood = -12304.636 Random-effects Poisson Number of obs = 3460 Group variable (i) : id Number of groups = 346 Random effects u_i ~ Gamma Obs per group: min = 10 avg = 10.0 max = 10 Wald chi2(1) = 521.29 Log likelihood = -12304.636 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ PAT | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- LOGR | .3195335 .0139952 22.832 0.000 .2921035 .3469635 _cons | 2.463614 .0806162 30.560 0.000 2.305609 2.621619 ---------+-------------------------------------------------------------------- /lnalpha | .4602225 .071276 .3205242 .5999208 ---------+-------------------------------------------------------------------- alpha | 1.584426 .1129315 1.37785 1.821975 ------------------------------------------------------------------------------ Likelihood ratio test of alpha=0: chi2(1) = 55527.72 Prob > chi2 = 0.0000 . . . ********** CLOSE OUTPUT . log close
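A few of the Poisson figures above can be cross-checked by hand. This sketch uses only numbers copied from the poisson and xtpois output in this log: it exponentiates /lnalpha and its confidence bounds to recover the alpha row, compares the robust and default standard errors of LOGR from the cross-section fits, and redoes the bookkeeping for the groups dropped by xtpois, fe.

```stata
* Figures copied from the poisson / xtpois output in this log.

* xtpois, re: alpha = exp(/lnalpha); the log reports alpha = 1.584426
display exp(.4602225)

* Exponentiated confidence bounds of /lnalpha
* the log reports 1.37785 and 1.821975 for alpha
display exp(.3205242)
display exp(.5999208)

* Cross-section Poisson: ratio of the robust to the default
* standard error of LOGR
display .0137359/.0015537

* xtpois, fe dropped 8 all-zero groups (80 obs)
* the log reports 338 groups and 3380 observations
display 346 - 8
display 3460 - 80
```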