------------------------------------------------------------------------------------------------------ log: c:\Imbook\bwebpage\Section2\mma08p1cmtests.txt log type: text opened on: 17 May 2005, 14:04:20 . . ********** OVERVIEW OF MMA08P1CMTESTS.DO ********** . . * STATA Program . * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi . * used for "Microeconometrics: Methods and Applications" . * by A. Colin Cameron and Pravin K. Trivedi (2005) . * Cambridge University Press . . * Chapter 8.2.6 pages 269-71 . * Conditional moment tests example producing Table 8.1 . . * (A) TEST OF THE CONDITIONAL MEAN . * (B) TEST THAT CONDITIONAL VARIANCE = MEAN . * (C) ALTERNATIVE TEST THAT CONDITIONAL VARIANCE = MEAN . * (D) INFORMATION MATRIX TEST . * (E) CHI-SQUARE GOODNESS OF FIT TEST . * for a Poisson model with generated data (see below). . . * The data generation requires free Stata add-on command rndpoix . * In Stata: search rndpoix . . ********** SETUP ********** . . set more off . version 8.0 . set scheme s1mono /* Used for graphs */ . . ********** GENERATE DATA ********** . . * Model is . * y ~ Poisson[exp(b1 + b2*x2] . * where . * x2 is iid ~ N[0,1] . * and b1=0 and b2=1. . . set seed 10001 . set obs 200 obs was 0, now 200 . scalar b1 = 0 . scalar b2 = 1 . . * Generate regressors . gen x2 = invnorm(uniform()) . . * Generate y . gen mupoiss = exp(b1+b2*x2) . * The next requires Stata add-on. In Stata: search rndpoix . rndpoix(mupoiss) ( Generating ................ ) Variable xp created. . gen y = xp . . * Write data to a text (ascii) file so can use with programs other than Stata . outfile y x2 using mma08p1cmtests.asc, replace . . ********* POISSON REGRESSION ********** . . poisson y x2 Iteration 0: log likelihood = -263.53818 Iteration 1: log likelihood = -263.5288 Iteration 2: log likelihood = -263.5288 Poisson regression Number of obs = 200 LR chi2(1) = 321.75 Prob > chi2 = 0.0000 Log likelihood = -263.5288 Pseudo R2 = 0.3791 ------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x2 | 1.12402 .0687868 16.34 0.000 .9892006 1.25884 _cons | -.1652935 .089065 -1.86 0.063 -.3398578 .0092707 ------------------------------------------------------------------------------ . * Obtain exp(x'b) . . * Obtain the scores to be used later . predict yhat (option n assumed; predicted number of events) . * For the Poisson s = dlnf(y)/db = (y - exp(x'b))*x . gen s1 = (y - yhat) . gen s2 = (y - yhat)*x2 . . * Summarize data . * Should get s1 and s2 summing to zero . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- x2 | 200 -.0091098 1.010072 -2.857666 2.149822 mupoiss | 200 1.599601 1.674071 .0574026 8.58333 xp | 200 1.525 2.363749 0 15 y | 200 1.525 2.363749 0 15 yhat | 200 1.525 1.803242 .0341372 9.498652 -------------+-------------------------------------------------------- s1 | 200 1.36e-09 1.36719 -3.148933 6.245292 s2 | 200 6.69e-09 1.889198 -6.420406 12.97311 . . ********** ANALYSIS: CONDITIONAL MOMENTS TESTS ********** . . * The program is appropriate for MLE with density assumed to be correctly specified. . * Let H0: E[m(y,x,theta)] = 0 . * Then CM = explained sum of squares or N times uncentered Rsq from . * auxiliary regression of 1 on m and the components of s = dlnf(y)//dtheta . * The test is chi-squared with dim(m) degrees of freedom. . . * Define the dependent variable one for the aucxiliary regressions . gen one = 1 . . *** (A) TEST OF THE CONDITIONAL MEAN (Table 8.1 p.270 row 1) . . * Test H0: E[(y - exp(x'b))*z] = 0 where z = x2sq . . * A smilar test is relevant for many nonlinear models . * Just change the expression for the conditional mean. . * Here we used E[y|x] = exp(x'b) for the Poisson . * Also for the Poisson z cannot be x as this sums to zero by Poisson foc . * For some other models (basically non-LEF models) z can be x . . gen z = x2*x2 . gen mA = (y - yhat)*z . regress one mA s1 s2, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 3, 197) = 1.09 Model | 3.27177115 3 1.09059038 Prob > F = 0.3536 Residual | 196.728229 197 .998620451 R-squared = 0.0164 -------------+------------------------------ Adj R-squared = 0.0014 Total | 200 200 1 Root MSE = .99931 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mA | .1046155 .0577969 1.81 0.072 -.0093646 .2185956 s1 | -.0377486 .0822939 -0.46 0.647 -.2000387 .1245415 s2 | -.1544278 .1029465 -1.50 0.135 -.3574463 .0485908 ------------------------------------------------------------------------------ . scalar CMA = e(N)*e(r2) . di "CMA: " CMA " p-value: " chi2tail(1,CMA) CMA: 3.2717711 p-value: .07048149 . . * Check that three different ways give same answer. . di "N times Uncentered R-squared: " e(N)*e(r2) N times Uncentered R-squared: 3.2717711 . di "Explained Sum of Squares: " e(mss) Explained Sum of Squares: 3.2717711 . di "N minus Residual Sum of Squares: " e(N) - e(rss) N minus Residual Sum of Squares: 3.2717711 . . *** (B) TEST THAT CONDITIONAL VARIANCE = MEAN (Table 8.1 p.270 row 2) . . * Test H0: E[{(y - exp(x'b))^2 - exp(x'b)}*x] = 0 . . * This test is peculiar to Poisson which restricts mean = variance . . * Here m has 2 terms . gen mB1 = ((y - yhat)^2 - yhat) . gen mB2 = ((y - yhat)^2 - yhat)*x2 . regress one mB1 mB2 s1 s2, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 4, 196) = 0.60 Model | 2.43400011 4 .608500026 Prob > F = 0.6604 Residual | 197.566 196 1.0079898 R-squared = 0.0122 -------------+------------------------------ Adj R-squared = -0.0080 Total | 200 200 1 Root MSE = 1.004 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mB1 | .0432045 .0542516 0.80 0.427 -.0637873 .1501963 mB2 | -.0052374 .0357193 -0.15 0.884 -.0756808 .065206 s1 | -.0399879 .1073712 -0.37 0.710 -.251739 .1717633 s2 | -.003196 .0852726 -0.04 0.970 -.1713655 .1649735 ------------------------------------------------------------------------------ . scalar CMB = e(N)*e(r2) . di "CMB: " CMB " p-value: " chi2tail(2,CMB) CMB: 2.4340001 p-value: .29611717 . . *** (C) ALTERNATIVE TEST THAT CONDITIONAL VARIANCE = MEAN (Table 8.1 p.270 row 3) . . * Test H0: E[{(y - exp(x'b))^2 - y}*x] = 0 . . * This test is peculiar to Poisson which restricts mean = variance . * This test is also peculiar as here dm/db = 0 . . * Here m has 2 terms . gen mC1 = ((y - yhat)^2 - y) . gen mC2 = ((y - yhat)^2 - y)*x2 . . * To be consistent with other tests include s1 and s2. . regress one mC1 mC2 s1 s2, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 4, 196) = 0.60 Model | 2.43400011 4 .608500027 Prob > F = 0.6604 Residual | 197.566 196 1.0079898 R-squared = 0.0122 -------------+------------------------------ Adj R-squared = -0.0080 Total | 200 200 1 Root MSE = 1.004 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mC1 | .0432045 .0542516 0.80 0.427 -.0637873 .1501963 mC2 | -.0052374 .0357192 -0.15 0.884 -.0756808 .065206 s1 | .0032166 .0825345 0.04 0.969 -.1595531 .1659863 s2 | -.0084334 .0641096 -0.13 0.895 -.1348665 .1179997 ------------------------------------------------------------------------------ . scalar CMC = e(N)*e(r2) . di "CMC: " CMC " p-value: " chi2tail(2,CMC) CMC: 2.4340001 p-value: .29611717 . . * Since dm/db = 0 could just do the regression without the scores . regress one mC1 mC2, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 2, 198) = 1.21 Model | 2.40695177 2 1.20347588 Prob > F = 0.3016 Residual | 197.593048 198 .997944688 R-squared = 0.0120 -------------+------------------------------ Adj R-squared = 0.0021 Total | 200 200 1 Root MSE = .99897 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mC1 | .0458705 .0510111 0.90 0.370 -.0547243 .1464652 mC2 | -.0075807 .03212 -0.24 0.814 -.0709218 .0557605 ------------------------------------------------------------------------------ . scalar CMCnoscores = e(N)*e(r2) . di "CMCnoscores: " CMC " p-value: " chi2tail(2,CMCnoscores) CMCnoscores: 2.4340001 p-value: .30014911 . . *** (D) INFORMATION MATRIX TEST (Table 8.1 p.270 row 4) . . * Test H0: E[{(y - exp(x'b))^2 - y}*vech(xx')] = 0 . . * A similar test is relevant for other parametric models . * In general m = vech(d2lnf(y)/dbdb') . * and for Poisson this yields above . . * Here m is a 3x1 vector . gen mD1 = ((y - yhat)^2 - y) . gen mD2 = ((y - yhat)^2 - y)*x2 . gen mD3 = ((y - yhat)^2 - y)*x2*x2 . . * To be consistent with other tests include s1 and s2. . regress one mD1 mD2 mD3 s1 s2, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 5, 195) = 0.58 Model | 2.9463051 5 .58926102 Prob > F = 0.7129 Residual | 197.053695 195 1.01053177 R-squared = 0.0147 -------------+------------------------------ Adj R-squared = -0.0105 Total | 200 200 1 Root MSE = 1.0053 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mD1 | .0546342 .0566422 0.96 0.336 -.0570759 .1663442 mD2 | -.0712751 .0994042 -0.72 0.474 -.2673205 .1247703 mD3 | .0330527 .0464213 0.71 0.477 -.0584996 .124605 s1 | -.0098554 .0846533 -0.12 0.907 -.176809 .1570982 s2 | -.0146441 .0647803 -0.23 0.821 -.1424041 .1131158 ------------------------------------------------------------------------------ . scalar CMD = e(N)*e(r2) . di "CMD: " CMD " p-value: " chi2tail(3,CMD) CMD: 2.9463051 p-value: .39997818 . . * Since dm/db = 0 could just do the regression without the scores . regress one mD1 mD2 mD3, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 3, 197) = 0.91 Model | 2.73445751 3 .911485837 Prob > F = 0.4370 Residual | 197.265542 197 1.00134793 R-squared = 0.0137 -------------+------------------------------ Adj R-squared = -0.0013 Total | 200 200 1 Root MSE = 1.0007 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mD1 | .056165 .054176 1.04 0.301 -.0506743 .1630043 mD2 | -.056325 .0911035 -0.62 0.537 -.2359884 .1233384 mD3 | .0233527 .0408339 0.57 0.568 -.057175 .1038805 ------------------------------------------------------------------------------ . scalar CMDnoscores = e(N)*e(r2) . di "CMDnoscores: " CMDnoscores " p-value: " chi2tail(3,CMDnoscores) CMDnoscores: 2.7344575 p-value: .43440333 . . *** (E) CHI-SQUARE GOODNESS OF FIT TEST (Table 8.1 p.270 row 5) . . * Test H0: E[{d_j - Pr[y = j]] = 0 . * where d_j = 1 if y = j for j = 0, 1, 2, and 3 or more . * and Pr[y = j] = exp(-lamda)*lamda^y/y! for lamda = exp(x'b) . * Cells get too small if have more cells than up to 3 or more. . . * A similar test is relevant for other parametric models, . * though a natural partitioning for y may be less obvious. . . * Here m has 4 terms . gen d0 = 0 . replace d0 = 1 if y==0 (87 real changes made) . gen d1 = 0 . replace d1 = 1 if y==1 (51 real changes made) . gen d2 = 0 . replace d2 = 1 if y==2 (22 real changes made) . gen p0 = exp(-yhat) . gen p1 = exp(-yhat)*yhat . gen p2 = exp(-yhat)*(yhat^2)/2 . gen mE1 = d0 - p0 . gen mE2 = d1 - p1 . gen mE3 = d2 - p2 . regress one mE1 mE2 mE3 s1 s2, noconstant Source | SS df MS Number of obs = 200 -------------+------------------------------ F( 5, 195) = 0.49 Model | 2.50056717 5 .500113433 Prob > F = 0.7807 Residual | 197.499433 195 1.0128176 R-squared = 0.0125 -------------+------------------------------ Adj R-squared = -0.0128 Total | 200 200 1 Root MSE = 1.0064 ------------------------------------------------------------------------------ one | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mE1 | 1.020078 .7290569 1.40 0.163 -.4177712 2.457927 mE2 | .7149016 .5053259 1.41 0.159 -.2817042 1.711507 mE3 | .2705081 .383646 0.71 0.482 -.4861201 1.027136 s1 | .2916116 .2217763 1.31 0.190 -.1457765 .7289997 s2 | -.1341565 .1125046 -1.19 0.235 -.3560384 .0877255 ------------------------------------------------------------------------------ . scalar CME = e(N)*e(r2) . di "CME: " CME " p-value: " chi2tail(3,CME) CME: 2.5005672 p-value: .47518859 . . * Wrong alternative is basic chisquare . quietly sum d0 . scalar sumd0 = r(sum) . quietly sum d1 . scalar sumd1 = r(sum) . quietly sum d2 . scalar sumd2 = r(sum) . scalar sumd3 = 1 - sumd0 - sumd1 - sumd2 . quietly sum p0 . scalar sump0 = r(sum) . quietly sum p1 . scalar sump1 = r(sum) . quietly sum p2 . scalar sump2 = r(sum) . scalar sump3 = 1 - sump0 - sump1 - sump2 . scalar chisq = (sumd0-sump0)^2/sump0 + (sumd1-sump1)^2/sump1 /* > */ + (sumd2-sump2)^2/sump2 + (sumd3-sump3)^2/sump3 . di "Wrong Traditional chi-square: " chisq " p = " chi2tail(3,chisq) Wrong Traditional chi-square: .47431003 p = .92449803 . . . ********** DISPLAY RESULTS (Table 8.1 p.270) ********** . . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- x2 | 200 -.0091098 1.010072 -2.857666 2.149822 mupoiss | 200 1.599601 1.674071 .0574026 8.58333 xp | 200 1.525 2.363749 0 15 y | 200 1.525 2.363749 0 15 yhat | 200 1.525 1.803242 .0341372 9.498652 -------------+-------------------------------------------------------- s1 | 200 1.36e-09 1.36719 -3.148933 6.245292 s2 | 200 6.69e-09 1.889198 -6.420406 12.97311 one | 200 1 0 1 1 z | 200 1.015227 1.286795 .0000877 8.166255 mA | 200 .1563713 3.403966 -13.52498 26.94856 -------------+-------------------------------------------------------- mB1 | 200 .334863 3.470417 -6.436038 30.24896 mB2 | 200 .43869 5.749749 -11.74974 62.83503 mC1 | 200 .334863 3.077815 -6.838236 24.00367 mC2 | 200 .43869 4.897291 -12.484 49.86192 mD1 | 200 .334863 3.077815 -6.838236 24.00367 -------------+-------------------------------------------------------- mD2 | 200 .43869 4.897291 -12.484 49.86192 mD3 | 200 .8381842 9.190652 -22.791 103.5763 d0 | 200 .435 .4970011 0 1 d1 | 200 .255 .436955 0 1 d2 | 200 .11 .3136749 0 1 -------------+-------------------------------------------------------- p0 | 200 .429237 .2918348 .000075 .9664389 p1 | 200 .2406035 .1137756 .000712 .367864 p2 | 200 .1235594 .0894167 .0005631 .2706694 mE1 | 200 .005763 .4287003 -.9289918 .9571021 mE2 | 200 .0143965 .4210301 -.367864 .9315748 -------------+-------------------------------------------------------- mE3 | 200 -.0135594 .3065698 -.2706694 .9688674 . . * Gives Rows 1-5 of Table 8.1 (The CMxnoscores are not reported) . di "CMA: " CMA " p-value: " chi2tail(1,CMA) CMA: 3.2717711 p-value: .07048149 . di "CMB: " CMB " p-value: " chi2tail(2,CMB) CMB: 2.4340001 p-value: .29611717 . di "CMC: " CMC " p-value: " chi2tail(2,CMC) CMC: 2.4340001 p-value: .29611717 . di "CMD: " CMD " p-value: " chi2tail(3,CMD) CMD: 2.9463051 p-value: .39997818 . di "CME: " CME " p-value: " chi2tail(3,CME) CME: 2.5005672 p-value: .47518859 . di "CMCnoscores: " CMCnoscores " p-value: " chi2tail(2,CMCnoscores) CMCnoscores: 2.4069518 p-value: .30014911 . di "CMDnoscores: " CMDnoscores " p-value: " chi2tail(3,CMDnoscores) CMDnoscores: 2.7344575 p-value: .43440333 . . ********** FURTHER ANALYSIS gives M** column in Table 8.1 ********** . . * The following drops the scores from the regression. Provides lower bound. . * Results are reported in last column in Table 8.1 . quietly regress one mA, noconstant . di "CMA without scores:" e(N)*e(r2) " with p = " chi2tail(1,e(N)*e(r2)) CMA without scores:.42328231 with p = .51530376 . quietly regress one mB1 mB2, noconstant . di "CMB without scores:" e(N)*e(r2) " with p = " chi2tail(2,e(N)*e(r2)) CMB without scores:1.8897296 with p = .38873213 . quietly regress one mC1 mC2, noconstant . di "CMC without scores:" e(N)*e(r2) " with p = " chi2tail(2,e(N)*e(r2)) CMC without scores:2.4069518 with p = .30014911 . quietly regress one mD1 mD2 mD3, noconstant . di "CMD without scores:" e(N)*e(r2) " with p = " chi2tail(3,e(N)*e(r2)) CMD without scores:2.7344575 with p = .43440333 . quietly regress one mE1 mE2 mE3, noconstant . di "CME without scores:" e(N)*e(r2) " with p = " chi2tail(3,e(N)*e(r2)) CME without scores:.73842732 with p = .86413036 . . ********** CLOSE OUTPUT . log close log: c:\Imbook\bwebpage\Section2\mma08p1cmtests.txt log type: text closed on: 17 May 2005, 14:04:20 ----------------------------------------------------------------------------------------------------