** STATA Sample Program by Colin Cameron
** Program stboot2.do October 1999

* To run you need the file 
*    jaggia.asc
* in your directory

* This program demonstrates use of 
* - Bootstrap command in Stata 

* Application is to Poisson regression


********** DATA DESCRIPTION
*
* The original data are from Sanjiv Jaggia and Satish Thosar, 1993,
* "Multiple Bids as a Consequence of Target Management Resistance"
* Review of Quantitative Finance and Accounting, 447-457.
* This is used e.g. in A.C.Cameron and P.K.Trivedi (1998)
* "Regression Analysis of Count Data", Cambridge University Press, pp.146-151. 
*
* 1. DOCNO     Doc No.
* 2. WEEKS     Weeks
* 3. NUMBIDS   Count (Dependent Variable)
* 4. TAKEOVER  Delta (1 if taken over)
* 5. BIDPREM   Bid Premium
* 6. INSTHOLD  Institutional Holdings
* 7. SIZE      Size measured in billions
* 8. LEGLREST  Legal Restructuring
* 9. REALREST  Real Restructuring
* 10. FINREST  Financial Restructuring
* 11. REGULATN Regulation
* 12. WHTKNGHT White Knight
* and this program will create
* 13. SIZESQ   Size Squared


********** CLOSE FILES POSSIBLY OPEN FROM PREVIOUS EXECUTION
clear
capture log close   
* capture in front means program continues even if no log file open 


********** CREATE OUTPUT FILE
log using stboot.log, replace
di "stboot.do by Colin Cameron: Stata bootstrap example using Poisson"


********** MEMORY MANAGEMENT
*
set maxvar 100 width 1000
* If more memory is needed, type help memory in Stata


********** READ DATA
* You need file jaggia.asc in your directory
* Infile: FREE FORMAT WITHOUT DICTIONARY
* Values in jaggia.asc are separated by spaces, so the file is in space-delimited
* free format and no dictionary file is needed
* The following command spans more than one line so use /* and */

infile docno weeks numbids takeover bidprem insthold size leglrest /*
   */ realrest finrest regulatn whtknght using jaggia.asc

* To drop off extra blanks (if any) at end of file jaggia.asc
drop if _n>126


********** DATA TRANSFORMATIONS
gen sizesq = size*size
label variable sizesq "size squared"


********** CHECK DATA: DESCRIPTIVE STATISTICS
describe
summarize


********** POISSON REGRESSION 

* The Poisson regression itself is run at the start of the next section
* As the command spans more than one line it uses /* and */


********** BOOTSTRAP POISSON REGRESSION 
*
* First run the initial model once to get parameter estimates
poisson numbids leglrest realrest finrest whtknght bidprem insthold size /*
    */ sizesq regulatn
*
* Second do bootstrap
* Need to give the model command, the statistics to bootstrap (here the
* coefficients in _b[ ]) and the number of reps
* In addition save each rep as one observation in a Stata data file
* And set the random seed so that the same results are obtained on rerun
*
capture erase bsstboot.dta
set seed 10001
#delimit ;
bs "poisson numbids leglrest realrest finrest whtknght bidprem insthold size 
    sizesq regulatn"   "_b[leglrest] _b[realrest] _b[finrest] 
    _b[whtknght] _b[bidprem] _b[insthold] _b[size] _b[sizesq] _b[regulatn]", 
     reps(100) level(95) saving(bsstboot);
#delimit cr

* The bs command reports three bootstrap 100*(1-alpha) percent confidence
* intervals, (1) to (3) below
* (1) Regular asymptotic normal: bhat +/- z_alpha/2*se(bhat)
*     except instead of using the initial se(bhat) 
*     we use the standard deviation of bhat from the bootstrap reps
* (2) Percentile method: orders the bhat(s) from the bootstrap reps and
*     goes from the alpha/2 quantile to the 1-alpha/2 quantile of the bhat(s)
*     where (s) denotes the s-th bootstrap sample
* (3) Bias-corrected. This works with the pivotal Wald statistic.
*     See the manual or a textbook,
*     e.g. Efron and Tibshirani (1993, pp.184-188) with a=0
*     This orders the bhat(s) from the bootstrap reps and
*     goes from the p1 quantile to the p2 quantile
*     where p1 and p2 are bias-correction adjustments to alpha/2 and 1-alpha/2
*     Let p1 = Phi(2z0 - z_alpha/2)
*         p2 = Phi(2z0 + z_alpha/2)
*         z0 measures the median bias in bhat with
*         z0 = Phi-inv(fraction of the bhat(s) < bhat)
*     And if z0=0 then p1 = alpha/2 and no correction 
* (4) Stata does not give the bootstrap t-interval (percentile-t)
*     e.g. Efron and Tibshirani (1993, pp.160-162)  
*     e.g. Cameron and Trivedi (1998) pp.164-167
*     For sample s compute t-test(s) = (bhat(s)-bhat) / se(s)
*     where bhat is the estimate from the original sample 
*     and bhat(s) and se(s) are for the s-th bootstrap sample.
*     Order the t-test(s) statistics and take the alpha/2 and 1-alpha/2 percentiles
*     Confidence interval as computed below is
*          bhat + t-test(alpha/2 percentile)*se(bhat)
*       to bhat + t-test(1-alpha/2 percentile)*se(bhat)  
*     where se(bhat) is the original standard error.
*     
* The first two are usual root-N. 
* The 3rd and 4th are preferred asymptotic refinements.
      
  
********** PERCENTILE-T
*
* For simplicity this is for the first regressor in the model
* rather than for all regressors
poisson numbids leglrest realrest finrest whtknght bidprem insthold size /*
    */ sizesq regulatn
* Get b and se for first regressor in the model
scalar b1=_b[leglrest]
scalar se1=_se[leglrest]
di " coefficient b1  " b1  "  standard error se1  " se1
* The following gets rid of the initial data set but keeps matrices etc.
drop _all
* Now use the saved bootstrap coefficients from earlier
use bsstboot
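* Note: bs saved only the coefficients, so se(s) is not available here and
* the original se1 is used in its place; a full percentile-t would divide
* by the standard error from each bootstrap sample
gen ttest1 = (bs1 - b1)/se1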
* Get the percentiles saved in r(r1) and r(r2)
_pctile ttest1, p(2.5,97.5)
di "lower 2.5 and upper 2.5 percentile of ttest: " r(r1) "  and  " r(r2) 
scalar lb1 = b1 + r(r1)*se1    /* Note the plus sign here */
scalar ub1 = b1 + r(r2)*se1 
di "percentile-t interval lower and upper bounds:  (" lb1   ","  ub1 ")" 
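

********** PERCENTILE AND BIAS-CORRECTED BY HAND
*
* A minimal sketch, not part of the original program: it recomputes by hand
* intervals (2) and (3) described earlier for the first regressor, using the
* bootstrap coefficients bs1 still in memory and the scalar b1 from above.
* Assumptions: the scalar names z0, lbp, ubp, lbbc, ubbc and the macros p1, p2
* are illustrative only; normprob() and invnorm() are the older Stata names for
* Phi and Phi-inv (newer releases call them normal() and invnormal());
* the results need not match the bs output to every digit.
*
* (2) Percentile method
_pctile bs1, p(2.5,97.5)
scalar lbp = r(r1)
scalar ubp = r(r2)
di "percentile interval:  (" lbp "," ubp ")"
* (3) Bias-corrected with a=0
* z0 is missing, and the step fails, if none or all of the bs1 lie below b1
quietly count if bs1 < b1
scalar z0 = invnorm(r(N)/_N)
local p1 = 100*normprob(2*z0 - invnorm(0.975))
local p2 = 100*normprob(2*z0 + invnorm(0.975))
_pctile bs1, p(`p1',`p2')
scalar lbbc = r(r1)
scalar ubbc = r(r2)
di "bias-corrected interval:  (" lbbc "," ubbc ")"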


********** VARIATIONS
*
* If the model involves more than one estimation command, e.g. a two-step estimator,
* then instead use the bstrap command, which can call a longer program
* Also can use bsample to do the bootstrap resampling oneself
*
* Following does not work
/* #delimit ;
program define bpoisson
  if "`1'" == "?" {
     global S_1 "leglrest realrest finrest whtknght bidprem insthold size sizesq regulatn"
     exit}
       "_b[leglrest] _b[realrest] _b[finrest] 
    _b[whtknght] _b[bidprem] _b[insthold] _b[size] _b[sizesq] _b[regulatn]", 
     reps(500) level(95) saving(bsstboot);
* Here as command spans more than one line use #delimit ;
*/
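
* A minimal sketch, not part of the original program, of doing the bootstrap
* by hand with bsample and post, keeping se(s) from each bootstrap sample so
* that t-test(s) = (bhat(s)-bhat)/se(s) can be formed for the first regressor.
* Assumptions: the file name bsmanual, the variable names coef_s, stderr_s,
* tstat_s and the scalars lbs, ubs are illustrative only; scalars b1 and se1
* are those defined in the PERCENTILE-T section, and the interval follows the
* same sign convention as that section. Results will differ from the bs run.
drop _all
infile docno weeks numbids takeover bidprem insthold size leglrest /*
   */ realrest finrest regulatn whtknght using jaggia.asc
drop if _n>126
gen sizesq = size*size
set seed 10001
tempname sim
postfile `sim' coef_s stderr_s using bsmanual, replace
local s = 1
while `s' <= 100 {
    preserve
    * Resample the observations with replacement and re-estimate
    bsample
    quietly poisson numbids leglrest realrest finrest whtknght bidprem /*
        */ insthold size sizesq regulatn
    * Store the coefficient and its standard error from this bootstrap sample
    post `sim' (_b[leglrest]) (_se[leglrest])
    restore
    local s = `s' + 1
}
postclose `sim'
use bsmanual, clear
gen tstat_s = (coef_s - b1)/stderr_s
_pctile tstat_s, p(2.5,97.5)
scalar lbs = b1 + r(r1)*se1
scalar ubs = b1 + r(r2)*se1
di "percentile-t interval using se(s):  (" lbs "," ubs ")"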

********** RELATED COMMANDS
*
* simul is a handy way to store results from simulations, similar to bstrap
* post is a way to incrementally post results to a file from each round.


********** CLOSE OUTPUT
log close