----------------------------------------------------------------------------------
      name:
       log:  c:\acdbookrevision\stata_final_programs_2013\racd05.txt
  log type:  text
 opened on:  15 Jan 2013, 16:36:44

.
. ********** OVERVIEW OF racd05.do **********
.
. * STATA Program
. * copyright C 2013 by A. Colin Cameron and Pravin K. Trivedi
. * used for "Regression Analysis of Count Data" SECOND EDITION
. * by A. Colin Cameron and Pravin K. Trivedi (2013)
. * Cambridge University Press
.
. * Chapter 5
. *   5.2.5 BASICS
. *   5.3.4 GOODNESS-OF-FIT
.
. * To run you need file
. *   racd05data.dta
. * in your directory
. * and Stata user-written command
. *   countfit
.
. ********** SETUP **********
.
. set more off
. version 12
. clear all
. set linesize 82
. set scheme s1mono  // Graphics scheme
.
. ************
.
. * This STATA program does analysis of takeover bids studied in chapter 5
. *   5.2.5 RESIDUALS
. *   5.3.4 R-SQUARED and GOODNESS-OF-FIT
. *   5.4.2 TESTS OF NONNESTED MODELS
.
.
. ********** DATA DESCRIPTION
.
. * The original data are from Sanjiv Jaggia and Satish Thosar, 1993,
. * "Multiple Bids as a Consequence of Target Management Resistance"
. * Review of Quantitative Finance and Accounting, 447-457.
. * The data are also used in
. * A.C. Cameron and Per Johansson (1997),
. * "Count Data Regression Models using Series Expansions: with Applications",
. * Journal of Applied Econometrics, May, Vol. 12, pp.203-223.
.
. * For more details see these datasets and racd05makedata.dta
.
. *************** 5.2.5 TAKEOVER BIDS: DESCRIPTIVE STATISTICS
.
. use racd05data.dta, clear
.
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       DOCNO |       126    82174.41    2251.783      78001      85059
       WEEKS |       126    11.44898    7.711424      2.857     41.429
     NUMBIDS |       126    1.738095    1.432081          0         10
    TAKEOVER |       126           1           0          1          1
     BIDPREM |       126    1.346806     .189325    .942675   2.066366
-------------+--------------------------------------------------------
    INSTHOLD |       126    .2518175    .1856136          0       .904
        SIZE |       126    1.219031    3.096624    .017722     22.169
    LEGLREST |       126    .4285714    .4968472          0          1
    REALREST |       126    .1825397    .3878308          0          1
     FINREST |       126    .1031746    .3054011          0          1
-------------+--------------------------------------------------------
    REGULATN |       126    .2698413    .4456492          0          1
    WHTKNGHT |       126    .5952381    .4928054          0          1
      SIZESQ |       126    10.99902    59.91479    .000314   491.4646
    CONSTANT |       126           1           0          1          1

. describe

Contains data from racd05data.dta
  obs:           126
 vars:            14                          7 Jun 2011 10:36
 size:         7,056
----------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------------
DOCNO           float  %9.0g                  Document Number
WEEKS           float  %9.0g                  Weeks
NUMBIDS         float  %9.0g                  Number of takeover bids
TAKEOVER        float  %9.0g                  Equals 1 if taken over
BIDPREM         float  %9.0g                  Bid price divided by price 14
                                                working days before bid
INSTHOLD        float  %9.0g                  Percentage of stock held by
                                                institutions
SIZE            float  %9.0g                  Total book valiue of assets in
                                                billions of dollars
LEGLREST        float  %9.0g                  Equals 1 if legal defense by lawsuit
REALREST        float  %9.0g                  Equals 1 if proposed changes in
                                                asset structure
FINREST         float  %9.0g                  Equals 1 i proposed changes in
                                                ownership structure
REGULATN        float  %9.0g                  Equals 1 if intervention by federal
                                                regulators
WHTKNGHT        float  %9.0g                  Equals 1 if management invitation
                                                for friendly third-party bid
SIZESQ          float  %9.0g                  SIZE Squared
CONSTANT        float  %9.0g
----------------------------------------------------------------------------------
Sorted by:

.
.
. global XLIST LEGLREST REALREST FINREST WHTKNGHT BIDPREM INSTHOLD SIZE SIZESQ REG
> ULATN
.
. *** TABLE 5.1: ACTUAL FREQUENCY DISTRIBUTION
.
. tabulate NUMBIDS

  Number of |
   takeover |
       bids |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          9        7.14        7.14
          1 |         63       50.00       57.14
          2 |         31       24.60       81.75
          3 |         12        9.52       91.27
          4 |          6        4.76       96.03
          5 |          1        0.79       96.83
          6 |          2        1.59       98.41
          7 |          1        0.79       99.21
         10 |          1        0.79      100.00
------------+-----------------------------------
      Total |        126      100.00

.
. *** TABLE 5.2: VARIABLE DEFINITIONS AND SUMMARY STATISTICS
.
. describe NUMBIDS $XLIST

              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------------
NUMBIDS         float  %9.0g                  Number of takeover bids
LEGLREST        float  %9.0g                  Equals 1 if legal defense by lawsuit
REALREST        float  %9.0g                  Equals 1 if proposed changes in
                                                asset structure
FINREST         float  %9.0g                  Equals 1 i proposed changes in
                                                ownership structure
WHTKNGHT        float  %9.0g                  Equals 1 if management invitation
                                                for friendly third-party bid
BIDPREM         float  %9.0g                  Bid price divided by price 14
                                                working days before bid
INSTHOLD        float  %9.0g                  Percentage of stock held by
                                                institutions
SIZE            float  %9.0g                  Total book valiue of assets in
                                                billions of dollars
SIZESQ          float  %9.0g                  SIZE Squared
REGULATN        float  %9.0g                  Equals 1 if intervention by federal
                                                regulators

. summarize NUMBIDS $XLIST

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     NUMBIDS |       126    1.738095    1.432081          0         10
    LEGLREST |       126    .4285714    .4968472          0          1
    REALREST |       126    .1825397    .3878308          0          1
     FINREST |       126    .1031746    .3054011          0          1
    WHTKNGHT |       126    .5952381    .4928054          0          1
-------------+--------------------------------------------------------
     BIDPREM |       126    1.346806     .189325    .942675   2.066366
    INSTHOLD |       126    .2518175    .1856136          0       .904
        SIZE |       126    1.219031    3.096624    .017722     22.169
      SIZESQ |       126    10.99902    59.91479    .000314   491.4646
    REGULATN |       126    .2698413    .4456492          0          1

.
.
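. * EDITOR'S SKETCH: not part of the original run, so no output is shown.
. * Before any model is fitted, the unconditional variance-to-mean ratio of
. * NUMBIDS gives a first look at dispersion. From the summary above, the
. * variance is about 1.432^2 = 2.05 against a mean of about 1.74. The sketch
. * relies only on r(Var) and r(mean), which summarize leaves behind. It does
. * not condition on the regressors, so it need not agree with the
. * regression-based dispersion estimate reported later in this log.
. /*
> quietly summarize NUMBIDS
> display "Unconditional variance/mean = " r(Var)/r(mean)
> */
.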
*************** 5.2.5 (Continued) RESIDUALS . . *** TABLE 5.3: POISSON QMLE Estimates, Standard Errors, T-statistics . . poisson NUMBIDS $XLIST, vce(robust) Iteration 0: log pseudolikelihood = -184.9518 Iteration 1: log pseudolikelihood = -184.94833 Iteration 2: log pseudolikelihood = -184.94833 Poisson regression Number of obs = 126 Wald chi2(9) = 34.98 Prob > chi2 = 0.0001 Log pseudolikelihood = -184.94833 Pseudo R2 = 0.0825 ------------------------------------------------------------------------------ | Robust NUMBIDS | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- LEGLREST | .2601464 .1250534 2.08 0.037 .0150463 .5052465 REALREST | -.1956597 .1816167 -1.08 0.281 -.5516219 .1603025 FINREST | .0740301 .263571 0.28 0.779 -.4425597 .5906198 WHTKNGHT | .4813822 .1064947 4.52 0.000 .2726563 .690108 BIDPREM | -.6776958 .2974241 -2.28 0.023 -1.260636 -.0947553 INSTHOLD | -.3619912 .3231799 -1.12 0.263 -.9954122 .2714297 SIZE | .1785026 .0623544 2.86 0.004 .0562902 .3007149 SIZESQ | -.0075693 .0027788 -2.72 0.006 -.0130157 -.002123 REGULATN | -.0294392 .1420508 -0.21 0.836 -.3078537 .2489753 _cons | .9860598 .4137383 2.38 0.017 .1751477 1.796972 ------------------------------------------------------------------------------ . . *** OVERDISPERSION TESTS presented in text - here underdispersion . * Estimate from Pearson statistic divided by (n-k) . quietly glm NUMBIDS $XLIST, family(poisson) . display "Var = phi*E[y] where phi = " e(dispers_ps) Var = phi*E[y] where phi = .74639511 . * LM Overdispersion test - here underdispersion . quietly poisson NUMBIDS $XLIST . predict mu, n . generate ystar = ((NUMBIDS - mu)^2 - NUMBIDS) / mu . * Test against NB2 variance . 
regress ystar mu, noconstant Source | SS df MS Number of obs = 126 -------------+------------------------------ F( 1, 125) = 1.41 Model | 2.06997594 1 2.06997594 Prob > F = 0.2377 Residual | 183.855052 125 1.47084041 R-squared = 0.0111 -------------+------------------------------ Adj R-squared = 0.0032 Total | 185.925028 126 1.47559546 Root MSE = 1.2128 ------------------------------------------------------------------------------ ystar | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- mu | -.0682968 .0575706 -1.19 0.238 -.1822362 .0456425 ------------------------------------------------------------------------------ . * Test against NB1 variance . regress ystar, vce(robust) Linear regression Number of obs = 126 F( 0, 125) = 0.00 Prob > F = . R-squared = 0.0000 Root MSE = 1.1772 ------------------------------------------------------------------------------ | Robust ystar | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- _cons | -.3175595 .1048714 -3.03 0.003 -.525113 -.110006 ------------------------------------------------------------------------------ . sum mu ystar Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- mu | 126 1.738095 .7106812 .7156423 4.427624 ystar | 126 -.3175595 1.177179 -1.284358 5.945698 . drop mu ystar . . *** CONSTRUCT RESIDUALS after command glm . * NOTE: Stata glm uses different terminology from the book . * Stata standardized multiplies residual by (1-h_ii)^(-1/2) . * We call this studentized (our star) . * Stata studentized multiplies residual by one over the . * estimated square root of the estimated scale parameter . * NOTE: Deviance residual differs from that in First Edition (error in first) . quietly glm NUMBIDS $XLIST, family(poisson) . predict mu, mu . generate raw = NUMBIDS - mu . predict pear, pearson . 
. predict pearstar, pearson standardized
. predict dev, deviance
. predict devstar, deviance standardized
. generate devadj = dev + 1/(6*sqrt(mu))
. predict anscombe, anscombe
. predict hat, hat
. * Extras for completeness
. predict pearstud, pearson studentized
. predict pearstan, pearson standardized
. predict devstud, deviance studentized
. predict devstan, deviance standardized
.
. *** TABLE 5.4: DESCRIPTIVE STATISTICS FOR VARIOUS RESIDUALS
.
. summarize raw pear pearstar dev devstar devadj anscombe

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         raw |       126   -4.73e-09     1.22863  -3.225367   5.572376
        pear |       126    .0015625    .8322573  -1.606458   3.026831
    pearstar |       126   -.0028455    .8857773   -1.87161   3.111568
         dev |       126   -.0898632    .8371262  -2.271875   2.397711
     devstar |       126   -.0991351    .8906361  -2.365741   2.666335
-------------+--------------------------------------------------------
      devadj |       126    .0436852     .838656  -2.168127   2.519731
    anscombe |       126   -.0968019    .8535656  -2.409687   2.415112

. tabstat raw pear pearstar dev devstar devadj anscombe, ///
>    statistics(mean sd skew kurt min p10 p90 max) col(stat) format(%9.2f)

    variable |      mean        sd  skewness  kurtosis       min       p10
-------------+------------------------------------------------------------
         raw |     -0.00      1.23      1.38      7.48     -3.23     -1.29
        pear |      0.00      0.83      1.12      4.99     -1.61     -0.96
    pearstar |     -0.00      0.89      1.12      5.19     -1.87     -0.99
         dev |     -0.09      0.84      0.29      3.88     -2.27     -1.11
     devstar |     -0.10      0.89      0.30      4.03     -2.37     -1.27
      devadj |      0.04      0.84      0.25      3.82     -2.17     -1.03
    anscombe |     -0.10      0.85      0.21      3.93     -2.41     -1.11
--------------------------------------------------------------------------

    variable |       p90       max
-------------+--------------------
         raw |      1.33      5.57
        pear |      1.01      3.03
    pearstar |      1.05      3.11
         dev |      0.93      2.40
     devstar |      0.94      2.67
      devadj |      1.03      2.52
    anscombe |      0.93      2.42
----------------------------------

.
. *** TABLE 5.5: CORRELATIONS OF VARIOUS RESIDUALS
.
.
correlate raw pear pearstar dev devstar devadj anscombe (obs=126) | raw pear pearstar dev devstar devadj anscombe -------------+--------------------------------------------------------------- raw | 1.0000 pear | 0.9759 1.0000 pearstar | 0.9830 0.9976 1.0000 dev | 0.9564 0.9839 0.9813 1.0000 devstar | 0.9637 0.9822 0.9843 0.9975 1.0000 devadj | 0.9549 0.9830 0.9804 0.9996 0.9973 1.0000 anscombe | 0.9512 0.9801 0.9772 0.9997 0.9969 0.9992 1.0000 . . *** RESIDUAL PLOTS (several) . . * Anscombe residual plotted against y . label variable anscombe "Anscombe residual" . graph twoway scatter anscombe NUMBIDS, msize(medium) xlabel(#6) saving(racd05gra > ph1, replace) (file racd05graph1.gph saved) . * graph twoway (scatter anscombe NUMBIDS, msize(medium)) /// > * (lowess anscombe NUMBIDS, lwidth(medthick)), xlabel(#6) saving(racd05graph1, > replace) . . * Anscombe residual plotted against fitted mean . label variable mu "Predicted bids" . graph twoway scatter anscombe mu, msize(medium) xlabel(#6) saving(racd05graph2, > replace) (file racd05graph2.gph saved) . * graph twoway (scatter anscombe mu, msize(medium)) /// > * (lowess anscombe mu, lwidth(medthick)), xlabel(#6) saving(racd05graph2, repl > ace) . . * Ordered anscombe residual plotted against standard normal ordinates . * NOTE: Axes reversed from the First Edition . qnorm anscombe, msize(medium) xlabel(#6) saving(racd05graph3, replace) (file racd05graph3.gph saved) . . * Diagonal entries in Hat matrix for each observation plotted against observatio > n number . generate obsno = _n . label variable obsno "Observation number" . label variable hat "Diagonal entry in H" . graph twoway scatter hat obsno, msize(medium) xlabel(#6) saving(racd05graph4, re > place) (file racd05graph4.gph saved) . . *** FIGURE 5.1: RESIDUAL PLOTS . . graph combine racd05graph1.gph racd05graph2.gph racd05graph3.gph racd05graph4.gp > h, /// > iscale(0.7) ysize(5) xsize(6) rows(2) . 
graph export racd05fig1.eps, replace (file racd05fig1.eps written in EPS format) . graph export racd05fig1.wmf, replace (file c:\acdbookrevision\stata_final_programs_2013\racd05fig1.wmf written in Windo > ws Metafile format) . . * Identify and drop the observations with largest HAT matrix diagonal term . poisson NUMBIDS $XLIST, vce(robust) Iteration 0: log pseudolikelihood = -184.9518 Iteration 1: log pseudolikelihood = -184.94833 Iteration 2: log pseudolikelihood = -184.94833 Poisson regression Number of obs = 126 Wald chi2(9) = 34.98 Prob > chi2 = 0.0001 Log pseudolikelihood = -184.94833 Pseudo R2 = 0.0825 ------------------------------------------------------------------------------ | Robust NUMBIDS | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- LEGLREST | .2601464 .1250534 2.08 0.037 .0150463 .5052465 REALREST | -.1956597 .1816167 -1.08 0.281 -.5516219 .1603025 FINREST | .0740301 .263571 0.28 0.779 -.4425597 .5906198 WHTKNGHT | .4813822 .1064947 4.52 0.000 .2726563 .690108 BIDPREM | -.6776958 .2974241 -2.28 0.023 -1.260636 -.0947553 INSTHOLD | -.3619912 .3231799 -1.12 0.263 -.9954122 .2714297 SIZE | .1785026 .0623544 2.86 0.004 .0562902 .3007149 SIZESQ | -.0075693 .0027788 -2.72 0.006 -.0130157 -.002123 REGULATN | -.0294392 .1420508 -0.21 0.836 -.3078537 .2489753 _cons | .9860598 .4137383 2.38 0.017 .1751477 1.796972 ------------------------------------------------------------------------------ . estimates store PFULL . scalar kreg = e(k) . scalar Nobs = e(N) . list obsno hat if hat > 3*kreg/Nobs +------------------+ | obsno hat | |------------------| 36. | 36 .2756452 | 80. | 80 .3174862 | 83. | 83 .6960669 | 85. | 85 .3207826 | 102. | 102 .2830565 | |------------------| 126. | 126 .2971494 | +------------------+ . 
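. * EDITOR'S SKETCH: not part of the original run, so no output is shown.
. * The rule used above flags observations with h_ii > 3*k/N. With kreg = 10
. * and Nobs = 126 the cutoff is 30/126, roughly 0.238, which is consistent
. * with the smallest value listed above (.2756452). The lines below assume
. * the scalars kreg and Nobs and the variables hat and obsno are still in
. * memory. They show the cutoff, count the flagged observations, and list
. * the largest leverage values in descending order.
. /*
> display "Leverage cutoff 3k/N = " 3*kreg/Nobs
> count if hat > 3*kreg/Nobs
> gsort -hat
> list obsno hat in 1/6
> sort obsno
> */
.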
poisson NUMBIDS $XLIST if hat < 3*kreg/Nobs, vce(robust) Iteration 0: log pseudolikelihood = -170.38986 Iteration 1: log pseudolikelihood = -170.38984 Poisson regression Number of obs = 120 Wald chi2(9) = 48.01 Prob > chi2 = 0.0000 Log pseudolikelihood = -170.38984 Pseudo R2 = 0.0698 ------------------------------------------------------------------------------ | Robust NUMBIDS | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- LEGLREST | .2588834 .1232742 2.10 0.036 .0172705 .5004964 REALREST | -.3575586 .1975979 -1.81 0.070 -.7448433 .029726 FINREST | .2322629 .2469026 0.94 0.347 -.2516573 .7161831 WHTKNGHT | .4961819 .1059455 4.68 0.000 .2885325 .7038313 BIDPREM | -.9555442 .2964309 -3.22 0.001 -1.536538 -.3745504 INSTHOLD | -.2576956 .3202307 -0.80 0.421 -.8853361 .369945 SIZE | .0887005 .140571 0.63 0.528 -.1868135 .3642146 SIZESQ | .0059204 .0263884 0.22 0.822 -.0457998 .0576407 REGULATN | -.0430669 .1382526 -0.31 0.755 -.314037 .2279033 _cons | 1.381025 .399858 3.45 0.001 .5973183 2.164733 ------------------------------------------------------------------------------ . estimates store PNOOUTLIERS . estimates table PFULL PNOOUTLIERS, b(%9.3f) se stats(ll) -------------------------------------- Variable | PFULL PNOOUTL~S -------------+------------------------ LEGLREST | 0.260 0.259 | 0.125 0.123 REALREST | -0.196 -0.358 | 0.182 0.198 FINREST | 0.074 0.232 | 0.264 0.247 WHTKNGHT | 0.481 0.496 | 0.106 0.106 BIDPREM | -0.678 -0.956 | 0.297 0.296 INSTHOLD | -0.362 -0.258 | 0.323 0.320 SIZE | 0.179 0.089 | 0.062 0.141 SIZESQ | -0.008 0.006 | 0.003 0.026 REGULATN | -0.029 -0.043 | 0.142 0.138 _cons | 0.986 1.381 | 0.414 0.400 -------------+------------------------ ll | -184.948 -170.390 -------------------------------------- legend: b/se . . *************** 5.3.4 R-SQUARED and CHISQUARE GOODNESS-OF-FIT . . *** Deviance, Pearson and R-squared measures presented in text . * Fitted model . 
glm NUMBIDS $XLIST, family(poisson) vce(robust) Iteration 0: log pseudolikelihood = -185.75208 Iteration 1: log pseudolikelihood = -184.95135 Iteration 2: log pseudolikelihood = -184.94833 Iteration 3: log pseudolikelihood = -184.94833 Generalized linear models No. of obs = 126 Optimization : ML Residual df = 116 Scale parameter = 1 Deviance = 88.61503283 (1/df) Deviance = .7639227 Pearson = 86.58183302 (1/df) Pearson = .7463951 Variance function: V(u) = u [Poisson] Link function : g(u) = ln(u) [Log] AIC = 3.094418 Log pseudolikelihood = -184.9483263 BIC = -472.3937 ------------------------------------------------------------------------------ | Robust NUMBIDS | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- LEGLREST | .2601464 .1250534 2.08 0.037 .0150463 .5052465 REALREST | -.1956597 .1816167 -1.08 0.281 -.5516219 .1603025 FINREST | .0740301 .263571 0.28 0.779 -.4425597 .5906198 WHTKNGHT | .4813822 .1064947 4.52 0.000 .2726563 .690108 BIDPREM | -.6776958 .2974241 -2.28 0.023 -1.260636 -.0947553 INSTHOLD | -.3619912 .3231799 -1.12 0.263 -.9954122 .2714297 SIZE | .1785026 .0623544 2.86 0.004 .0562902 .3007149 SIZESQ | -.0075693 .0027788 -2.72 0.006 -.0130157 -.002123 REGULATN | -.0294392 .1420508 -0.21 0.836 -.3078537 .2489753 _cons | .9860598 .4137383 2.38 0.017 .1751477 1.796972 ------------------------------------------------------------------------------ . display "Deviance Statistic = " e(deviance) Deviance Statistic = 88.615033 . display "Pearson Statistic = " e(deviance_p) Pearson Statistic = 86.581833 . scalar Devfitted = e(deviance) . scalar Pearsfitted = e(deviance_p) . * Intercept-only model . quietly glm NUMBIDS, family(poisson) vce(robust) . display "Deviance Statistic = " e(deviance) Deviance Statistic = 121.86157 . display "Pearson Statistic = " e(deviance_p) Pearson Statistic = 147.49315 . scalar Devintercept = e(deviance) . scalar Pearsintercept = e(deviance_p) . 
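. * EDITOR'S SKETCH: not part of the original run, so no output is shown.
. * The (1/df) Pearson value reported by glm above is the Pearson statistic
. * divided by the residual degrees of freedom: 86.58183302/116 = .7463951,
. * the same dispersion estimate phi displayed earlier from e(dispers_ps).
. * A minimal check, assuming e(df) holds the residual degrees of freedom
. * after glm. The scalars saved above are not touched by these lines.
. /*
> quietly glm NUMBIDS $XLIST, family(poisson)
> display "Pearson/(n-k) = " e(deviance_p)/e(df)
> display "e(dispers_ps) = " e(dispers_ps)
> */
.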
. * Calculate R-squared Deviance and Pearson
. scalar R2_Dev = 1 - Devfitted/Devintercept
. scalar R2_Pears = 1 - Pearsfitted/Pearsintercept
. display "Deviance R-squared = " R2_Dev " Fitted = " Devfitted " Intercept =
> " Devintercept
Deviance R-squared = .27282218 Fitted = 88.615033 Intercept = 121.86157
. display "Pearson R-squared = " R2_Pears " Fitted = " Pearsfitted " Intercep
> t = " Pearsintercept
Pearson R-squared = .41297726 Fitted = 86.581833 Intercept = 147.49315
. * Squared correlation coefficient
. capture drop mu
. quietly poisson NUMBIDS $XLIST, vce(robust)
. predict mu, n
. quietly correlate NUMBIDS mu
. display "Squared correlation coefficient = " r(rho)^2
Squared correlation coefficient = .26426804
. * Compare to OLS
. quietly regress NUMBIDS $XLIST
. display "OLS R-squared = " e(r2)
OLS R-squared = .23730025
.
. *** Predicted Probabilities and begin Chi-square Goodness-of-fit test
.
. ** In January 2013 there was a forthcoming Stata Journal article and
. ** user-written addon to implement chisquare goodness of fit test.
.
. * This program written for categories j = 0, 1, 2, ..., $REST or more
. global Y NUMBIDS
. global MAXCOUNT = 4 // Form cells y = 0, 1, 2, ... , maxcount
. global REST = 5 // The remaining category y >= $REST
. * Create indicators for y = 0, 1, 2, ..., maxcount and y >= $REST
. forvalues i = 0/$MAXCOUNT {
  2.   generate Dummy`i' = $Y==`i'
  3. }
. generate Dummy$REST = $Y > $MAXCOUNT
. * Create corresponding predicted probabilities of y = 0, 1, 2, ...
. quietly poisson $Y $XLIST
. forvalues i = 0/$MAXCOUNT {
  2.   predict Predicted`i', pr(`i')
  3. }
. predict Predicted$REST, pr($REST,.)
. * The preceding required Stata 12. Could instead use user-written addon countfit
. * or use recursion for Poisson probabilities as follows ..
.
/* > quietly poisson $Y $XLIST > capture drop mu > predict mu, n > generate Predicted0 = exp(-mu) > forvalues i = 1/$MAXCOUNT { > local j = `i' - 1 > generate Predicted`i' = Predicted`j'*mu/`i' > } > generate Predicted$REST = 1 > forvalues i = 0/$MAXCOUNT { > replace Predicted$REST = Predicted$REST - Predicted`i' > } > */ . * Create differences between actual and predicted . forvalues i = 0/$REST { 2. generate Difference`i' = Dummy`i' - Predicted`i' 3. } . . *** TABLE 5.6: ACTUAL AND PREDICTED FREQUENCIES . . summarize P* D* Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- Predicted0 | 126 .2132476 .1130779 .0119428 .488878 Predicted1 | 126 .2976829 .0747807 .0528784 .3678793 Predicted2 | 126 .232677 .0388698 .1170628 .2706697 Predicted3 | 126 .1366833 .0563692 .0298633 .2236638 Predicted4 | 126 .0680005 .0497715 .0053429 .1953667 -------------+-------------------------------------------------------- Predicted5 | 126 .0517086 .0767211 .0008662 .4541059 DOCNO | 126 82174.41 2251.783 78001 85059 Dummy0 | 126 .0714286 .2585675 0 1 Dummy1 | 126 .5 .501996 0 1 Dummy2 | 126 .2460317 .4324166 0 1 -------------+-------------------------------------------------------- Dummy3 | 126 .0952381 .2947154 0 1 Dummy4 | 126 .047619 .213809 0 1 Dummy5 | 126 .0396825 .1959916 0 1 Difference0 | 126 -.1418191 .2665421 -.488878 .9242796 Difference1 | 126 .2023171 .4843285 -.3677845 .9382253 -------------+-------------------------------------------------------- Difference2 | 126 .0133547 .4267367 -.2706697 .8536685 Difference3 | 126 -.0414452 .2909759 -.2236638 .9470747 Difference4 | 126 -.0203815 .2075399 -.1953667 .9305699 Difference5 | 126 -.0120261 .181479 -.4151017 .958654 . . *** Continue Chi-square Goodness-of-fit test . . * Obtain the scores to be used later . generate score = $Y - mu . foreach var of varlist $XLIST { 2. generate scorefor`var' = score*`var' 3. local i = `i' + 1 4. } . * Run the auxiliary regression . 
. generate ones = 1
. quietly regress ones Difference* score scorefor*, noconstant
. scalar CGOF = e(N)*e(r2)
. di "Chi-square GOF Test: " CGOF " p-value: " chi2tail($MAXCOUNT,CGOF)
Chi-square GOF Test: 48.659953 p-value: 6.875e-10
.
. * Compare to Stata user-written command countfit
. countfit NUMBIDS $XLIST, maxcount(10) prm nograph noestimates nofit

Comparison of Mean Observed and Predicted Count

         Maximum       At      Mean
Model   Difference   Value    |Diff|
---------------------------------------------
PRM        0.202       1       0.042

PRM: Predicted and actual probabilities

Count   Actual    Predicted    |Diff|    Pearson
------------------------------------------------
0        0.071      0.213       0.142     11.884
1        0.500      0.298       0.202     17.325
2        0.246      0.233       0.013      0.097
3        0.095      0.137       0.041      1.583
4        0.048      0.068       0.020      0.770
5        0.008      0.031       0.023      2.106
6        0.016      0.013       0.003      0.090
7        0.008      0.005       0.003      0.187
8        0.000      0.002       0.002      0.253
9        0.000      0.001       0.001      0.095
10       0.008      0.000       0.008     27.275
------------------------------------------------
Sum      1.000      1.000       0.458     61.666

.
. * Aside: Stata command estat gof is a quite different test
. * of whether deviance statistic is stat. different from chisquare(n-k)
. quietly glm NUMBIDS $XLIST, family(poisson) vce(robust)
. display chi2tail((e(N)-e(k)),e(deviance))
.97244872
. quietly poisson NUMBIDS $XLIST, vce(robust)
. estat gof

         Deviance goodness-of-fit =  88.61504
         Prob > chi2(116)         =    0.9724

         Pearson goodness-of-fit  =  86.58183
         Prob > chi2(116)         =    0.9812

.
. * Classification table (Confusion matrix)
. * Find the mode probability for each observation (i.e. k that maximizes Pr[y = k
> ]
. generate mode = 0
. forvalues i = 1/5 {
  2.   local j = `i' - 1
  3.   quietly replace mode = `i' if Predicted`i'> Predicted`j'
  4. }
. * Compare the actual count to the predicted mode
. generate NUMBIDSgrouped = NUMBIDS
. replace NUMBIDSgrouped = $REST if NUMBIDS > $REST
(4 real changes made)
.
tabulate NUMBIDSgrouped mode NUMBIDSgro | mode uped | 0 1 2 5 | Total -----------+--------------------------------------------+---------- 0 | 2 5 2 0 | 9 1 | 9 43 10 1 | 63 2 | 0 22 6 3 | 31 3 | 1 5 4 2 | 12 4 | 0 1 3 2 | 6 5 | 0 1 2 2 | 5 -----------+--------------------------------------------+---------- Total | 12 77 27 10 | 126 . tabulate mode mode | Freq. Percent Cum. ------------+----------------------------------- 0 | 12 9.52 9.52 1 | 77 61.11 70.63 2 | 27 21.43 92.06 5 | 10 7.94 100.00 ------------+----------------------------------- Total | 126 100.00 . count if NUMBIDSgrouped == mode 53 . . *************** 5.4.2 NON-NESTED MODELS: AIC, BIC and Vuong TEST . . * Poisson . poisson NUMBIDS $XLIST Iteration 0: log likelihood = -184.9518 Iteration 1: log likelihood = -184.94833 Iteration 2: log likelihood = -184.94833 Poisson regression Number of obs = 126 LR chi2(9) = 33.25 Prob > chi2 = 0.0001 Log likelihood = -184.94833 Pseudo R2 = 0.0825 ------------------------------------------------------------------------------ NUMBIDS | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- LEGLREST | .2601464 .1509594 1.72 0.085 -.0357286 .5560213 REALREST | -.1956597 .1926309 -1.02 0.310 -.5732093 .1818899 FINREST | .0740301 .2165219 0.34 0.732 -.3503452 .4984053 WHTKNGHT | .4813822 .1588698 3.03 0.002 .170003 .7927613 BIDPREM | -.6776958 .3767372 -1.80 0.072 -1.416087 .0606956 INSTHOLD | -.3619912 .4243292 -0.85 0.394 -1.193661 .4696788 SIZE | .1785026 .0600221 2.97 0.003 .0608614 .2961438 SIZESQ | -.0075693 .0031217 -2.42 0.015 -.0136878 -.0014509 REGULATN | -.0294392 .1605682 -0.18 0.855 -.344147 .2852686 _cons | .9860598 .5339201 1.85 0.065 -.0604044 2.032524 ------------------------------------------------------------------------------ . 
estat ic ----------------------------------------------------------------------------- Model | Obs ll(null) ll(model) df AIC BIC -------------+--------------------------------------------------------------- . | 126 -201.5716 -184.9483 10 389.8967 418.2595 ----------------------------------------------------------------------------- Note: N=Obs used in calculating BIC; see [R] BIC note . estimates store POISSON . * Hurdle logit / Poisson . hplogit NUMBIDS $XLIST initial: log likelihood = -254.30412 alternative: log likelihood = -226.28023 rescale: log likelihood = -226.28023 rescale eq: log likelihood = -195.17087 Iteration 0: log likelihood = -195.17087 Iteration 1: log likelihood = -187.21209 Iteration 2: log likelihood = -160.07158 Iteration 3: log likelihood = -159.50181 Iteration 4: log likelihood = -159.48681 Iteration 5: log likelihood = -159.47862 Iteration 6: log likelihood = -159.47747 Iteration 7: log likelihood = -159.47746 Poisson-Logit Hurdle Regression Number of obs = 126 Wald chi2(9) = 11.83 Log likelihood = -159.47746 Prob > chi2 = 0.2230 ------------------------------------------------------------------------------ | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- logit | LEGLREST | .9712886 .976998 0.99 0.320 -.9435923 2.88617 REALREST | -2.722899 .9997995 -2.72 0.006 -4.68247 -.763328 FINREST | -1.466672 1.174169 -1.25 0.212 -3.768001 .8346562 WHTKNGHT | 1.192886 .8733562 1.37 0.172 -.5188602 2.904633 BIDPREM | .8245185 2.483786 0.33 0.740 -4.043612 5.692649 INSTHOLD | -1.838757 2.411417 -0.76 0.446 -6.565049 2.887534 SIZE | .3478178 1.019675 0.34 0.733 -1.650708 2.346344 SIZESQ | .0126446 .1849397 0.07 0.945 -.3498306 .3751198 REGULATN | -1.141261 .9822143 -1.16 0.245 -3.066366 .7838435 _cons | 2.14836 3.472277 0.62 0.536 -4.657177 8.953898 -------------+---------------------------------------------------------------- poisson | LEGLREST | .4356921 .2145263 2.03 0.042 .0152282 .8561559 REALREST | -.0038302 .2473243 -0.02 0.988 -.4885769 .4809165 FINREST | .2651092 .273213 0.97 0.332 -.2703785 .8005969 WHTKNGHT | .8780368 .2760094 3.18 0.001 .3370683 1.419005 BIDPREM | -1.347424 .5342481 -2.52 0.012 -2.394531 -.3003171 INSTHOLD | -.6607018 .6081372 -1.09 0.277 -1.852629 .5312252 SIZE | .2381462 .0756031 3.15 0.002 .0899668 .3863256 SIZESQ | -.0102873 .0039627 -2.60 0.009 -.0180541 -.0025205 REGULATN | -.0571749 .2231117 -0.26 0.798 -.4944658 .3801159 _cons | 1.136037 .7603113 1.49 0.135 -.3541459 2.62622 ------------------------------------------------------------------------------ AIC Statistic = 2.690 . estat ic ----------------------------------------------------------------------------- Model | Obs ll(null) ll(model) df AIC BIC -------------+--------------------------------------------------------------- . | 126 . -159.4775 20 358.9549 415.6806 ----------------------------------------------------------------------------- Note: N=Obs used in calculating BIC; see [R] BIC note . estimates store PHURDLE . * ZIP . quietly zip NUMBIDS $XLIST, inflate($XLIST) . 
estat ic ----------------------------------------------------------------------------- Model | Obs ll(null) ll(model) df AIC BIC -------------+--------------------------------------------------------------- . | 126 -194.5065 -179.9954 20 399.9908 456.7164 ----------------------------------------------------------------------------- Note: N=Obs used in calculating BIC; see [R] BIC note . estimates store ZIP . . *** VUONG TEST presented in text . zip NUMBIDS $XLIST, inflate($XLIST) vuong Fitting constant-only model: Iteration 0: log likelihood = -271.80732 Iteration 1: log likelihood = -203.50285 (not concave) Iteration 2: log likelihood = -203.04135 (not concave) Iteration 3: log likelihood = -201.03128 Iteration 4: log likelihood = -197.55236 Iteration 5: log likelihood = -195.80465 Iteration 6: log likelihood = -194.77842 Iteration 7: log likelihood = -194.57139 Iteration 8: log likelihood = -194.52119 Iteration 9: log likelihood = -194.50947 Iteration 10: log likelihood = -194.50697 Iteration 11: log likelihood = -194.50653 Iteration 12: log likelihood = -194.50648 Iteration 13: log likelihood = -194.50647 Fitting full model: Iteration 0: log likelihood = -194.50647 Iteration 1: log likelihood = -181.46232 Iteration 2: log likelihood = -179.99738 Iteration 3: log likelihood = -179.9954 Iteration 4: log likelihood = -179.9954 Zero-inflated Poisson regression Number of obs = 126 Nonzero obs = 117 Zero obs = 9 Inflation model = logit LR chi2(9) = 29.02 Log likelihood = -179.9954 Prob > chi2 = 0.0006 ------------------------------------------------------------------------------ NUMBIDS | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- NUMBIDS | LEGLREST | .2177038 .152802 1.42 0.154 -.0817825 .5171902 REALREST | -.0315728 .1993239 -0.16 0.874 -.4222405 .3590949 FINREST | .1535876 .220583 0.70 0.486 -.2787472 .5859224 WHTKNGHT | .3814168 .1611181 2.37 0.018 .0656311 .6972024 BIDPREM | -.6668731 .3744586 -1.78 0.075 -1.400798 .0670522 INSTHOLD | -.3662654 .4213391 -0.87 0.385 -1.192075 .459544 SIZE | .1645811 .0606754 2.71 0.007 .0456594 .2835027 SIZESQ | -.0070204 .0031485 -2.23 0.026 -.0131913 -.0008494 REGULATN | .0367102 .1630259 0.23 0.822 -.2828146 .356235 _cons | 1.042256 .5298909 1.97 0.049 .0036886 2.080823 -------------+---------------------------------------------------------------- inflate | LEGLREST | -67.86428 42421.59 -0.00 0.999 -83212.65 83076.92 REALREST | 122.6682 76025.39 0.00 0.999 -148884.4 149129.7 FINREST | 37.73304 111246.2 0.00 1.000 -218000.8 218076.3 WHTKNGHT | -38.80397 146692.6 -0.00 1.000 -287551.1 287473.5 BIDPREM | -49.96548 384647 -0.00 1.000 -753944.2 753844.3 INSTHOLD | 116.0081 186299.9 0.00 1.000 -365025.2 365257.2 SIZE | -7.262702 66890.29 -0.00 1.000 -131109.8 131095.3 SIZESQ | .3918664 3302.474 0.00 1.000 -6472.339 6473.122 REGULATN | 76.77467 56540.79 0.00 0.999 -110741.1 110894.7 _cons | -76.91042 531108.8 -0.00 1.000 -1041031 1040877 ------------------------------------------------------------------------------ Vuong test of zip vs. standard Poisson: z = 2.05 Pr>z = 0.0200 . . *** TABLE 5.7: AIC and BIC . . * Does not list coefficients of all the regressors . 
. estimates table POISSON PHURDLE ZIP, b(%9.1f) keep(LEGLREST) ///
>    stats(N k ll aic bic) equations(1)

--------------------------------------------------
    Variable |  POISSON     PHURDLE       ZIP
-------------+------------------------------------
    LEGLREST |       0.3         1.0         0.2
-------------+------------------------------------
           N |       126         126         126
           k |      10.0        20.0        20.0
          ll |    -184.9      -159.5      -180.0
         aic |     389.9       359.0       400.0
         bic |     418.3       415.7       456.7
--------------------------------------------------

.
. ********** CLOSE OUTPUT
.
. * log close
.
end of do-file

. exit, clear
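. * EDITOR'S SKETCH: not part of the original run, so no output is shown.
. * The aic and bic rows of Table 5.7 follow AIC = -2*ll + 2*k and
. * BIC = -2*ll + k*ln(N). For the Poisson column, using the estat ic values:
. *   -2*(-184.9483) + 2*10       = 389.8967
. *   -2*(-184.9483) + 10*ln(126) = 418.2595 (approximately)
. * The lines below recompute both from stored results. They assume the data
. * set is in the working directory, that the global XLIST is defined as at
. * the top of this log, and that e(ll), e(rank) and e(N) are available
. * after poisson.
. /*
> use racd05data.dta, clear
> quietly poisson NUMBIDS $XLIST
> scalar aic = -2*e(ll) + 2*e(rank)
> scalar bic = -2*e(ll) + e(rank)*ln(e(N))
> display "AIC = " aic "   BIC = " bic
> */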