------------------------------------------------------------------------------------------------------
       log:  c:\Imbook\bwebpage\Section3\mma11p1boot.txt
  log type:  text
 opened on:  18 May 2005, 15:52:55

. 
. ********** OVERVIEW OF MMA11P1BOOT.DO **********
. 
. * STATA Program 
. * copyright C 2005 by A. Colin Cameron and Pravin K. Trivedi 
. * used for "Microeconometrics: Methods and Applications" 
. * by A. Colin Cameron and Pravin K. Trivedi (2005)
. * Cambridge University Press 
. 
. * Chapter 11.3 pages 366-368
. * Bootstrap applied to exponential regression model
. * Provides
. * (1) Bootstrap distribution of beta and t-statistic (Table 11.1)
. * (2) Various statistics from bootstrap (pages 366-8)
. * (3) Bootstrap density of the t-statistic (Figure 11.1) 
. * using generated data (see below)
. 
. * Note: To speed up progam reduce breps - the number of bootstrap replications
. *       But final program should use many repications
. 
. * Note: This program uses ereg which is an old Stata command 
. *       superceded by streg, dist(exp)
. 
. * Note: For bootstrap see also mm07p4boot.do
. *       which has additional commands / ways to bootstrap
. 
. ********** SETUP **********
. 
. set more off

. version 8

.   
. ********** GENERATE DATA **********
. 
. * Model is  y ~ exponential(exp(a + bx + cz))
. * where  x and z are joint normal (1,1,0.1,0.1,0.5)
. * i.e. means 0.1  and  0.1
. *      sd's  0.1  and  0.1 and correln 0.5 (so correln^2 = .25)
. *  variances 0.01 and 0.01 and covariance 0.005
. 
. * Generate data from joint normal
. * Use fact that x is N(mu0.1,0.1)
. *       and z | x is N(0.1 + .05/.1*(x - .1), .01x.75 = .0075)
. *       so that st dev = sqrt(0.0075) = 0.0866025
. 
. set obs 50 
obs was 0, now 50

. set seed 10001

. * Generate x and z bivariate normal
. scalar mu1=0.1

. scalar mu2=0.1

. scalar sig1=0.1

. scalar sig2=0.1

. scalar rho=0.5

. scalar sig12=rho*sig1*sig2 

. gen x = mu1 + sig1*invnorm(uniform()) 

. gen muzgivx = mu2+(sig12/(sig2*sig2))*(x-mu1)

. gen sigzgivx = sqrt(sig2*sig2*(1-rho*rho))

. gen z = muzgivx + sigzgivx*invnorm(uniform()) 

. * To generate y exponential with mean mu=Ey use
. *   Integral 0 to a of (1/mu)exp(-x/mu) dx   by change of variables
. * = Integral 0 to a/mu of exp(-t)dt
. * = incomplete gamma function P(0,a/mu) in the terminology of Stata
. gen Ey = exp(-2.0+2*x+2*z)

. gen y = Ey*invgammap(1,uniform())

. gen logy = log(y)

. 
. * Descriptive Statistics
. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           x |        50    .0935209    .1031485  -.1173506   .2778609
     muzgivx |        50    .0967604    .0515742  -.0086753   .1889304
    sigzgivx |        50    .0866025           0   .0866025   .0866025
           z |        50    .1033014    .0909297  -.0885447   .3137469
          Ey |        50    .2114837     .071719   .0945722   .4314067
-------------+--------------------------------------------------------
           y |        50    .2024206    .2237202   .0005293   .9601147
        logy |        50   -2.282336     1.45494  -7.543878  -.0407026

. ereg y x z

Iteration 0:   log likelihood = -84.246434  
Iteration 1:   log likelihood = -80.068104  
Iteration 2:   log likelihood = -79.871694  
Iteration 3:   log likelihood = -79.871338  
Iteration 4:   log likelihood = -79.871338  

Exponential regression -- entry time 0
log expected-time form                            Number of obs   =         50
                                                  LR chi2(2)      =       8.75
Log likelihood = -79.871338                       Prob > chi2     =     0.0126

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .2670543   1.417339     0.19   0.851    -2.510879    3.044988
           z |   4.663384   1.740712     2.68   0.007     1.251652    8.075117
       _cons |  -2.191619   .2328589    -9.41   0.000    -2.648014   -1.735224
------------------------------------------------------------------------------

. 
. save mma11p1boot, replace
file mma11p1boot.dta saved

. 
. * Write data to a text (ascii) file so can use with programs other than Stata
. outfile y x z using mma11p1boot.asc, replace

. 
. ********** SIMPLE BOOTSTRAP ********** 
. 
. * Stata produces four bootstrap 100*(1-alpha) confidence intervals
. * (N) and (P) have no asymptotic refinement
. * (BC)-(BCA) have asymptotic refinement
. * For details see program mma07p4boot.do
. 
. * Change the following for different number of simulations S
. * From page 399, for testing better to use 999 than 1000
. global breps = 999  /* The number of bootstrap reps used below */ 

. 
. set seed 20001

. 
. * A simple and adequate bootstrap command for the slope coefficients is
. bs "ereg y x z"   "_b[x] _b[z]", reps($breps) level(95)

command:      ereg y x z
statistics:   _bs_1      = _b[x]
              _bs_2      = _b[z]

Bootstrap statistics                              Number of obs    =        50
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .2670543 -.1885509  1.420956   -2.52135   3.055458   (N)
             |                                        -2.9054   2.696445   (P)
             |                                      -2.590993   2.864327  (BC)
       _bs_2 |   999  4.663384  .0524786  1.939086   .8582302   8.468539   (N)
             |                                       .5006047   8.483892   (P)
             |                                        .231034   8.174835  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

. 
. ********** MORE DETAILED BOOTSTRAP **********
. 
. * The following bootstrap also gives standard error at each replication
. * and saves data from replications for further analysis
. 
. * In partiulcar, want to use the percentile-t method, 
. * which provides asymtptotic refinement
. 
. * Stata does not give this. For methods see
. * e.g. Efron and Tibsharani (1993, pp.160-162)  
. * e.g. Cameron and Trivedi (2005) Chapter 11.2.6-11.2.7
. * For sample s compute t-test(s) = (bhat(s)-bhat) / se(s)
. * where bhat is initial estimate 
. * and bhat(s) and se(s) are for sth round.
. * Order the t-test(s) statistics and choose the alpha/2 percentiles
. * which give the critical values for the t-test
. 
. * Implementation requires saving the results from each bootstrap replication 
. * in order to obtain ccritical values from percentiles of bootstrap distribution
. 
. use mma11p1boot.dta, clear

. 
. * Get and store coefficients (b) 
. * for regressors in the original model and data before bootstrap
. quietly ereg y x z

. global bx=_b[x]

. global sex=_se[x]

. global bz=_b[z]

. global sez=_se[z]

. di " Coefficients    bx:  " $bx  "  and bz:  " $bz
 Coefficients    bx:  .26705432  and bz:  4.6633845

. di " Standard error sex:  " $sex "  and sez: " $sez
 Standard error sex:  1.4173391  and sez: 1.7407119

. 
. * Bootstrap and save coeff estimates and se's from each replication
. set seed 20001

. bs "ereg y x z"   "_b[x] _b[z] _se[x] _se[z]", reps($breps) level(95) saving(mma11p1bootreps) repl
> ace

command:      ereg y x z
statistics:   _bs_1      = _b[x]
              _bs_2      = _b[z]
              _bs_3      = _se[x]
              _bs_4      = _se[z]

Bootstrap statistics                              Number of obs    =        50
                                                  Replications     =       999

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   999  .2670543 -.1885509  1.420956   -2.52135   3.055458   (N)
             |                                        -2.9054   2.696445   (P)
             |                                      -2.590993   2.864327  (BC)
       _bs_2 |   999  4.663384  .0524786  1.939086   .8582302   8.468539   (N)
             |                                       .5006047   8.483892   (P)
             |                                        .231034   8.174835  (BC)
       _bs_3 |   999  1.417339  .0644196  .1718393   1.080131   1.754547   (N)
             |                                       1.234399   1.902349   (P)
             |                                       1.196068   1.742845  (BC)
       _bs_4 |   999  1.740712  .0910103   .186631   1.374478   2.106946   (N)
             |                                       1.542322   2.257937   (P)
             |                                       1.453673   2.058318  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

. 
. * Now use the bootstrap estimates
. use mma11p1bootreps, clear
(bootstrap: ereg y x z)

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       _bs_1 |       999    .0785034    1.420956  -9.431229   4.278278
       _bs_2 |       999    4.715863    1.939086  -1.747643   12.09208
       _bs_3 |       999    1.481759    .1718393   1.145421   2.761842
       _bs_4 |       999    1.831722     .186631   1.387625   2.910449

. * Order comes from  "_b[x] _b[z] _se[x] _se[z]" in earlier bs
. gen bxs = _bs_1

. gen bzs = _bs_2

. gen sexs = _bs_3

. gen sezs = _bs_4

. gen ttestxs = (bxs - $bx)/sexs

. gen ttestzs = (bzs - $bz)/sezs

. 
. **********  (1) TABLE 11.1 (page 367)
. 
. summarize bzs ttestzs, d

                             bzs
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -.3361366      -1.747643
 5%     1.544816      -1.716207
10%     2.270323      -1.366866       Obs                 999
25%     3.570291      -1.205571       Sum of Wgt.         999

50%      4.77197                      Mean           4.715863
                        Largest       Std. Dev.      1.939086
75%     5.970802       10.10243
90%     7.100958       10.42623       Variance       3.760056
95%     7.810663       10.76733       Skewness      -.1344324
99%     9.426978       12.09208       Kurtosis       3.545415

                           ttestzs
-------------------------------------------------------------
      Percentiles      Smallest
 1%     -2.66391      -3.921595
 5%    -1.727528      -3.483456
10%     -1.32364      -3.201425       Obs                 999
25%    -.6209012      -2.975815       Sum of Wgt.         999

50%     .0618649                      Mean           .0261125
                        Largest       Std. Dev.      1.046855
75%     .7034938       2.693856
90%     1.323415       3.087892       Variance       1.095904
95%      1.70558        3.11692       Skewness      -.1596043
99%     2.529097       3.738328       Kurtosis       3.337749

. 
. * Additionally need the 2.5 and 97.5 percentiles not given in summarize, d
. 
. * Coefficient of z
. _pctile bzs, p(2.5,97.5)

. di " Lower 2.5 and upper 2.5 percentile of coeff b for z: " r(r1) "  and  " r(r2) 
 Lower 2.5 and upper 2.5 percentile of coeff b for z: .50060469  and  8.4838924

. 
. * t-statistic for z 
. _pctile ttestzs, p(2.5,97.5)

. di " Lower 2.5 and upper 2.5 percentile of ttest on z: " r(r1) "  and  " r(r2) 
 Lower 2.5 and upper 2.5 percentile of ttest on z: -2.1827998  and  2.0659592

. 
. ********** (2) RESULTS IN TEXT PAGES 366-7 ********** 
. 
. * (2A) Bootstrap standard error estimate (no refinement)
. * These are given earlier in bootstrap table output
. * Equivalently get the standard deviation of bzs
. 
. quietly sum bzs

. scalar bzbootse = r(sd)

. di "Bootstrap estimate of standard error: " bzbootse
Bootstrap estimate of standard error: 1.9390864

. 
. * (2B) Test b3 = 0 using percentile-t method (asymptotic refinement)
. * Use the 2.5% and 97.5% bootstrap critical values for t-statistic for z 
. 
. _pctile ttestzs, p(2.5,97.5)

. di " Lower 2.5 and upper 2.5 percentile of ttest on z: " r(r1) "  and  " r(r2) 
 Lower 2.5 and upper 2.5 percentile of ttest on z: -2.1827998  and  2.0659592

. 
. * (2D) 95% confidence interval with asymptotic refinement
. * Use the preceding critical values
. 
. scalar lbz = $bz + r(r1)*$sez    /* Note the plus sign here */

. scalar ubz = $bz + r(r2)*$sez 

. di " Percentile-t interval lower and upper bounds:  (" lbz   ","  ubz ")" 
 Percentile-t interval lower and upper bounds:  (.86375888,8.2596243)

. 
. * (2B-Var) Variation for symmetric two-sided test on z 
. 
. gen absttestzs = abs(ttestzs)

. _pctile absttestzs, p(95)

. di " Upper 5 percentile of symmetric two-sided test on z: " r(r1) "
 Upper 5 percentile of symmetric two-sided test on z: 2.0775187

. 
. * (2C) Test b3 = 0 without asymptotic refinement
. * Usual Wald test except use bootstrap estimate of standard error
. 
. scalar Wald = ($bz - 0) / bzbootse

. di "Wald statistic using bootstrap standard error: " Wald
Wald statistic using bootstrap standard error: 2.404939

. 
. * (2E) Bootstrap estimate of bias
. * This is given in the earlier bootstrap results table
. * and is explained in the text
. 
. ********** (3) FIGURE 11.1 (p.368) PLOTS ESTIMATED DENSITY OF T-STATISTIC FOR Z
. 
. set scheme s1mono

. label var ttestzs "Bootstrap t-statistic"

. kdensity ttestzs, normal /*
>   */ scale (1.2) plotregion(style(none)) /*
>   */ title("Bootstrap Density of 't-Statistic'") /*
>   */ xtitle("t-statistic from each bootstrap replication", size(medlarge)) xscale(titlegap(*5)) /*
>  
>   */ ytitle("Density", size(medlarge)) yscale(titlegap(*5)) /* 
>   */ legend(pos(11) ring(0) col(1)) legend(size(small)) /*
>   */ legend( label(1 "Bootstrap Estimate") label(2 "Standard Normal")) 

. graph save ch11boot, replace
(file ch11boot.gph saved)

. 
. ********** CLOSE OUTPUT **********
. log close
       log:  c:\Imbook\bwebpage\Section3\mma11p1boot.txt
  log type:  text
 closed on:  18 May 2005, 15:53:47
----------------------------------------------------------------------------------------------------