. di "stboot.do by Colin Cameron: Stata bootstrap example using Poisson" stboot.do by Colin Cameron: Stata bootstrap example using Poisson . . . ********** MEMORY MANAGEMENT . * . set maxvar 100 width 1000 (maxvar and maxobs no longer need be set with this version of Stata) . * If need more memory then in Stata give command help memory . . . ********** READ DATA . infile using jaggia.dct dictionary using jaggia.asc { _column(1) docno %16.8f "Document Number" _column(17) weeks %17.8f "Weeks" _column(34) numbids %17.8f "Number of takeover bids (after the first)" _column(51) takeover %17.8f "1 if takeover occurred, 0 otherwise" _newline _column(1) bidprem %16.8f "Bid price / price 14 work days before bid" _column(17) insthold %17.8f "Percentage of stock held by insitutions" _column(34) size %17.8f "Total book value of assets inb $billions " _column(51) leglrest %17.8f "Equals one if legal defense by lawsuit" _newline _column(1) realrest %16.8f "One if proposed changes in asset structure" _column(17) finrest %17.8f "One if proposed changes in ownership struc" _column(34) regulatn %17.8f "One if intervention by fed regulators" _column(57) whtknght %17.8f "One if mgmt invite friendly 3rd-party bid" * For this example the column numbers are redundant. * Also the long labels need not be given. } (126 observations read) . * To drop off extra blanks (if any) at end of file jaggia.asc . drop if _n>126 (0 observations deleted) . . . ********** DATA TRANSFORMATIONS . gen sizesq = size*size . label variable sizesq "size squared" . . . ******** CHECK DATA: DESCRIPTIVE STATISTICS . describe Contains data obs: 126 vars: 13 size: 7,056 (99.2% of memory free) ------------------------------------------------------------------------------- 1. docno float %9.0g Document Number 2. weeks float %9.0g Weeks 3. numbids float %9.0g Number of takeover bids (after the first) 4. takeover float %9.0g 1 if takeover occurred, 0 otherwise 5. bidprem float %9.0g Bid price / price 14 work days before bid 6. insthold float %9.0g Percentage of stock held by insitutions 7. size float %9.0g Total book value of assets inb $billions 8. leglrest float %9.0g Equals one if legal defense by lawsuit 9. realrest float %9.0g One if proposed changes in asset structure 10. finrest float %9.0g One if proposed changes in ownership struc 11. regulatn float %9.0g One if intervention by fed regulators 12. whtknght float %9.0g One if mgmt invite friendly 3rd-party bid 13. sizesq float %9.0g size squared ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved . summarize Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- docno | 126 82174.41 2251.783 78001 85059 weeks | 126 11.44898 7.711424 2.857 41.429 numbids | 126 1.738095 1.432081 0 10 takeover | 126 1 0 1 1 bidprem | 126 1.346806 .189325 .9426754 2.066366 insthold | 126 .2518175 .1856136 0 .904 size | 126 1.219031 3.096624 .017722 22.169 leglrest | 126 .4285714 .4968472 0 1 realrest | 126 .1825397 .3878308 0 1 finrest | 126 .1031746 .3054011 0 1 regulatn | 126 .2698413 .4456492 0 1 whtknght | 126 .5952381 .4928054 0 1 sizesq | 126 10.99902 59.91479 .0003141 491.4646 . . . ********** POISSON REGRESSION . . * Here as command spans more than one line use /* and */ . . . ********** BOOTSTRAP POISSON REGRESSION . * . * First run the initial model once to get parameter estimates . poisson numbids leglrest realrest finrest whtknght bidprem insthold size /* > */ sizesq regulatn Iteration 0: log likelihood = -184.9518 Iteration 1: log likelihood = -184.94833 Iteration 2: log likelihood = -184.94833 Poisson regression Number of obs = 126 LR chi2(9) = 33.25 Prob > chi2 = 0.0001 Log likelihood = -184.94833 Pseudo R2 = 0.0825 ------------------------------------------------------------------------------ numbids | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- leglrest | .2601464 .1509594 1.723 0.085 -.0357286 .5560213 realrest | -.1956597 .1926309 -1.016 0.310 -.5732093 .1818899 finrest | .07403 .2165219 0.342 0.732 -.3503452 .4984053 whtknght | .4813822 .1588698 3.030 0.002 .170003 .7927613 bidprem | -.6776959 .3767373 -1.799 0.072 -1.416087 .0606956 insthold | -.3619913 .4243292 -0.853 0.394 -1.193661 .4696788 size | .1785026 .0600221 2.974 0.003 .0608614 .2961438 sizesq | -.0075693 .0031217 -2.425 0.015 -.0136878 -.0014509 regulatn | -.0294392 .1605682 -0.183 0.855 -.344147 .2852686 _cons | .9860599 .5339202 1.847 0.065 -.0604044 2.032524 ------------------------------------------------------------------------------ . * . * Second do bootstrap . * Need to give the model, initial parameter estimates (in _b[ ]) and number r > eps . * In addition save each rep as one line in a Stata file . * And set random seed so that get same results next time around . * . capture erase bsstboot.dta . set seed 10001 . #delimit ; delimiter now ; . bs "poisson numbids leglrest realrest finrest whtknght bidprem insthold size > sizesq regulatn" "_b[leglrest] _b[realrest] _b[finrest] > _b[whtknght] _b[bidprem] _b[insthold] _b[size] _b[sizesq] _b[regulatn]", > reps(100) level(95) saving(bsstboot); command: poisson numbids leglrest realrest finrest whtknght bidprem insthol > d size sizesq regulatn statistics: _b[leglrest] _b[realrest] _b[finrest] _b[whtknght] _b[bidprem > ] _b[insthold] _b[size] _b[sizesq] _b[regulatn] (obs=126) Bootstrap statistics Variable | Reps Observed Bias Std. Err. [95% Conf. Interval] ---------+------------------------------------------------------------------- bs1 | 100 .2601464 -.0013745 .1189032 .0242166 .4960762 (N) | .0388509 .4607801 (P) | .0458346 .5240587 (BC) ---------+------------------------------------------------------------------- bs2 | 100 -.1956597 -.032869 .1990078 -.5905343 .1992149 (N) | -.5984617 .1203618 (P) | -.5918766 .1678654 (BC) ---------+------------------------------------------------------------------- bs3 | 100 .07403 -.0186181 .3099659 -.5410096 .6890697 (N) | -.7000712 .5912598 (P) | -.7000712 .5912598 (BC) ---------+------------------------------------------------------------------- bs4 | 100 .4813822 -.001494 .1227905 .2377392 .7250252 (N) | .219607 .7451068 (P) | .219607 .7451068 (BC) ---------+------------------------------------------------------------------- bs5 | 100 -.6776959 -.0821832 .2914152 -1.255927 -.0994649 (N) | -1.255341 -.1870143 (P) | -1.103497 -.0231217 (BC) ---------+------------------------------------------------------------------- bs6 | 100 -.3619913 .0739888 .2937436 -.9448423 .2208598 (N) | -.9247081 .3047988 (P) | -.9801925 .2354836 (BC) ---------+------------------------------------------------------------------- bs7 | 100 .1785026 -.0571057 .1137786 -.0472589 .4042641 (N) | -.1571039 .2886908 (P) | -.0637076 .3363777 (BC) ---------+------------------------------------------------------------------- bs8 | 100 -.0075693 .005312 .0102423 -.0278924 .0127537 (N) | -.0116357 .0222555 (P) | -.0146602 .0164609 (BC) ---------+------------------------------------------------------------------- bs9 | 100 -.0294392 .0223891 .1511903 -.3294335 .2705552 (N) | -.2774422 .24539 (P) | -.2774422 .24539 (BC) ----------------------------------------------------------------------------- N = normal, P = percentile, BC = bias-corrected . #delimit cr delimiter now cr . . * The program produces three bootstrap 100*(1-alpha) confidence intervals . * (1) Regular asymptotic normal: bhat +/- z_alpha/2*se(bhat) . * except instead of using the initial se(bhat) . * we use the standard deviation of bhat from the bootstrap reps . * (2) Percentile method: which orders the bhat(s) from simulations and . * goes from alpha/2 lowest bhat(s) to the alpha/2 highest bhat(s) . * where (s) denotes the s-th bootstrap sample . * (3) Bootstrap-corrected. This works with the pivotal Wald statistic. . * See the manual or a textbook. . * e.g. Efron and Tibsharani (1993, pp.184-188) with a=0 . * This orders the bhats from simulations and . * goes from p1 to the p2 highest . * where p1 and p2 are bias-correction adjustments to alpha/2 and 1-alpha/ > 2 . * Let p1 = Phi(2z0 - z_alpha/2) . * p2 = Phi(2z0 + z_alpha/2) . * z0 measures the median bias in bhat with . * z0 = Phi-inv(fraction of the bhat(s) < bhat) . * And if z0=0 then p1 = alpha/2 and no correction . * (4) Stata does not give the bootstrap t-interval . * e.g. Efron and Tibsharani (1993, pp.160-162) . * e.g. Cameron and Trivedi (1998) pp.164-167 . * For sample s compute t-test(s) = (bhat(s)-bhat) / se(s) . * where bhat is initial bootstrap estimates . * and bhat(s) and se(s) are for sth round. . * Order the t-test(s) statistics and choose the alpha/2 percentiles . * Confidence interval is bhat - t-test(alpha/2 percentile)*se(bhat) . * to bhat + t-test(1-alpha/2 percentile)*se(bhat) . * where se(bhat) is the original standard error. . * . * The first two are usual root-N. . * The 3rd and 4th are preferred asymptotic refinements. . . . ********** PERCENTILE-T . * . * For simplicity this is for the first regressor in the model . * rather than for all regressors . poisson numbids leglrest realrest finrest whtknght bidprem insthold size /* > */ sizesq regulatn Iteration 0: log likelihood = -184.9518 Iteration 1: log likelihood = -184.94833 Iteration 2: log likelihood = -184.94833 Poisson regression Number of obs = 126 LR chi2(9) = 33.25 Prob > chi2 = 0.0001 Log likelihood = -184.94833 Pseudo R2 = 0.0825 ------------------------------------------------------------------------------ numbids | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- leglrest | .2601464 .1509594 1.723 0.085 -.0357286 .5560213 realrest | -.1956597 .1926309 -1.016 0.310 -.5732093 .1818899 finrest | .07403 .2165219 0.342 0.732 -.3503452 .4984053 whtknght | .4813822 .1588698 3.030 0.002 .170003 .7927613 bidprem | -.6776959 .3767373 -1.799 0.072 -1.416087 .0606956 insthold | -.3619913 .4243292 -0.853 0.394 -1.193661 .4696788 size | .1785026 .0600221 2.974 0.003 .0608614 .2961438 sizesq | -.0075693 .0031217 -2.425 0.015 -.0136878 -.0014509 regulatn | -.0294392 .1605682 -0.183 0.855 -.344147 .2852686 _cons | .9860599 .5339202 1.847 0.065 -.0604044 2.032524 ------------------------------------------------------------------------------ . * Get b and se for first regressor in the model . scalar b1=_b[leglrest] . scalar se1=_se[leglrest] . di " coefficient b1 " b1 " standard error se1 " se1 coefficient b1 .26014637 standard error se1 .15095939 . * The following gets rid of the initial data set but keeps matrices etc. . drop _all . * Now use the saved bootstrap coefficients from earlier . use bsstboot (bs: poisson numbids leglrest realrest finrest whtknght) . gen ttest1 = (bs1 - b1)/se1 . * Get the percentiles saved in r(r1) and r(r2) . _pctile ttest1, p(2.5,99.5) . di "lower 2.5 and upper 2.5 percentile of ttest: " r(r1) " and " r(r2) lower 2.5 and upper 2.5 percentile of ttest: -1.4659271 and 2.0493307 . scalar lb1 = b1 + r(r1)*se1 /* Note the plus sign here */ . scalar ub1 = b1 + r(r2)*se1 . di "percentile-t interval lower and upper bounds: (" lb1 "," ub1 ")" percentile-t interval lower and upper bounds: (.03885091,.56951207) . . . ********** VARIATIONS . * . * If model involves more estimation than one command, e.g. two-step estimator > , . * then instead use bstrap command which can call longer program . * Also can use bsample to do bootstrapping oneself . * . * Following does not work . /* #delimit ; > program define bpoisson > if "`1'" == "?" { > global S_1 "leglrest realrest finrest whtknght bidprem insthold size siz > esq regulatn" > exit} > "_b[leglrest] _b[realrest] _b[finrest] > _b[whtknght] _b[bidprem] _b[insthold] _b[size] _b[sizesq] _b[regulatn]", > reps(500) level(95) saving(bsstboot); > * Here as command spans more than one line use #delimit ; > */ . . ********** RELATED COMMANDS . * . * simul is handy way to store results from simulations similar to bstrap . * post is way to incrementally post results to a file from each round. . . . ********** CLOSE OUTPUT . log close