** STATA Sample Program by Colin Cameron
** Program stmle.do October 1999
* To run you need files
* jaggia.asc and
* in your directory
* This program demonstrates maximum likelihood Poisson
* with and without analytical derivatives
* and with and without robust standard errors
*
* For more explanation on basic Stata see stjaggia.do
*
* Note the special use of quotes for temporary variables.
* Use e.g. `mu' not 'mu'
********** DATA DESCRIPTION
*
* The original data are from Sanjiv Jaggia and Satish Thosar, 1993,
* "Multiple Bids as a Consequence of Target Management Resistance"
* Review of Quantitative Finance and Accounting, 447-457.
* This is used e.g. in A.C.Cameron and P.K.Trivedi (1998)
* "Regression Analysis of Count Data", Cambridge University Press, pp.146-151
*
* 1. DOCNO Doc No.
* 2. WEEKS Weeks
* 3. NUMBIDS Count (Dependent Variable)
* 4. TAKEOVER Delta (1 if taken over)
* 5. BIDPREM Bid Premium
* 6. INSTHOLD Institutional Holdings
* 7. SIZE Size measured in billions
* 8. LEGLREST Legal Restructuring
* 9. REALREST Real Restructuring
* 10. FINREST Financial Restructuring
* 11. REGULATN Regulation
* 12. WHTKNGHT White Knight
* and this program will create
* 13. SIZESQ Size Squared
********** CLOSE FILES POSSIBLY OPEN FROM PREVIOUS EXECUTION
clear
capture log close
* capture in front means program continues even if no log file open
********** CREATE OUTPUT FILE
log using stmle.log, replace
di "stmle.do by Colin Cameron: Basic Stata MLE example using Poisson"
********** MEMORY MANAGEMENT
set maxvar 100 width 1000
********** READ DATA
* You need file jaggia.asc in your directory
* Infile: FREE FORMAT WITHOUT DICTIONARY
* As there is space between each observation data is also space-delimited
* free format and then there is no need for a dictionary file
* The following command spans more that one line so use /* and */
infile docno weeks numbids takeover bidprem insthold size leglrest /*
*/ realrest finrest regulatn whtknght using jaggia.asc
********** DATA TRANSFORMATIONS
gen sizesq = size*size
label variable sizesq "size squared"
******** CHECK DATA: DESCRIPTIVE STATISTICS
describe
summarize
********** POISSON REGRESSION
*
* Use /* and */ as command spans two lines
poisson numbids leglrest realrest finrest whtknght /*
*/ bidprem insthold size sizesq regulatn
********** MAXIMUM LIKELIHOOD FOR POISSON
* Based on Weibull example in [R] ml - maximum likelihood
* See especially Stata 6 Manual Reference H-O p.379 and p.388-399
* The possible methods are lf, d0, d1 and d2.
* Here I use d0, d1 and d2 which give log-density etc for each observation.
* I do not use lf which gives the log-likeihood over all observations
* The general form for the command is
* (1) Define the program, log-density, perhaps gradient and perhaps hessian
* program define myprog
* : :
* end
* (2) Run the program providing dependent variable(s) and regressor(s)
* ml model d*** myprog (dep1 dep2 ... = x's for eqn1) (x's for eqn2) ..
* ml maximize
* where d*** called by d0, d1, d1debug, d2 and d2debug
* and the debug variants check anayltical against numerical derivatives
* and for robust standard errors based on A-in*B*A-inv add , robust
* For the Poisson log-density
* ln L = Sum_i {-mu_i + y_i*ln(mu_i) - ln(y_i!)}
* where
* mu_i = exp(x_i'b)
*
* The first-order conditions are
* Sum_i (y_i - exp(x_i'b))x_i = 0
*
* and the second-derivative matrix is
* Sum_i -exp(x_i'b))*x_i*x_i'
* (1) Define the program, log-density, perhaps gradient and perhaps hessian
* The Poisson model is simple as only one index and one dependent variable
* There is one dependent variable y which is stored in $ML_y1
* There is one index which is x'b and for the Poisson model is mu i.e. E[y|x]
* For log-density -mu_i + y_i*ln(mu_i) - ln(y_i!) where mu_i = exp(x_i'b)
* For gradient d/db = (y_i - mu_i)*x_i
* but Stata wands d/dtheta_i = (y_i - mu_i) using theta_i = x_i'b
* For hessian mu_i*x_i*x_i'
* but Stata wants d2/dtheta_i^2 = mu_i using theta_i = x_i'b
* (2) Example is
* ml model d1debug mypois1 (numbids = bidprem)
* ml maximize
* For program debuggin which produces much output
* set trace on
* The following programs are given
* mypois0 use numerical derivatives throughout
* mypois1 use analytical first derivative and numerical second
* mypois1r same as mypois1 but form robust variance matrix
* mypois2 use analytical first and second derivative
* mypois1 same as mypois1 but form robust variance matrix
******* (1) MYPOIS0 PROGRAM: POISSON WITH NUMERICAL DERIVATIVES THROUGHOUT
program define mypois0
version 6.0
args todo b lnf /* Need to use the names todo b and lnf
todo always contains 1 and may be ignored
b is parameters and lnf is log-density */
tempvar theta1 /* create as needed to calculate lf, g, ... */
mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
local y "$ML_y1" /* create to make program more readable */
tempvar lnyfact mu
quietly gen double `lnyfact' = lnfact(`y')
quietly gen double `mu' = exp(`theta1')
mlsum `lnf' = -`mu' + `y'*ln(`mu') - `lnyfact'
end
******* (1) OUTPUT: POISSON WITH NUMERICAL DERIVATIVES THROUGHOUT
ml model d0 mypois0 (numbids = leglrest realrest finrest whtknght /*
*/ bidprem insthold size sizesq regulatn)
ml maximize
******* (2) MYPOIS1 PROGRAM: POISSON WITH ANAYLTICAL FIRST DERIVATIVES
program define mypois1
version 6.0
args todo b lnf g /* Need to use the names todo b and lnf
todo always contains 1 and may be ignored
b is parameters, lnf is log-density
g is gradient */
tempvar theta1 /* create as needed to calculate lf, g, ... */
mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
local y "$ML_y1" /* create to make program more readable */
tempvar lnyfact mu
quietly gen double `lnyfact' = lnfact(`y')
quietly gen double `mu' = exp(`theta1')
mlsum `lnf' = -`mu' + `y'*ln(`mu') - `lnyfact'
/* Following code is extra for analytical first derivatives */
if `todo'==0 | `lnf'==. {exit}
tempname d1
mlvecsum `lnf' `d1' = `y' - `mu'
matrix `g' = `d1'
end
******* (2) MYPOIS1 OUTPUT: POISSON WITH ANALYTICAL FIRST DERIVATIVES
ml model d1debug mypois1 (numbids = leglrest realrest finrest whtknght /*
*/ bidprem insthold size sizesq regulatn)
ml maximize
******* (3) MYPOIS1r PROGRAM: ROBUST SE'S FOR POISSON ANAYLTICAL FIRST
program define mypois1r
version 6.0
args todo b lnf g negH g1 /* Need to use the names todo b and lnf
todo always contains 1 and may be ignored
b is parameters, lnf is log-density
g is gradient negH need not be provided here
g1 is to be used for robust se's */
tempvar theta1 /* create as needed to calculate lf, g, ... */
mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
local y "$ML_y1" /* create to make program more readable */
tempvar lnyfact mu
quietly gen double `lnyfact' = lnfact(`y')
quietly gen double `mu' = exp(`theta1')
mlsum `lnf' = -`mu' + `y'*ln(`mu') - `lnyfact'
/* Following code is extra for analytical first derivatives
and also changed due to robust se's */
if `todo'==0 | `lnf'==. {exit}
tempname d1
quietly replace `g1' = `y' - `mu'
mlvecsum `lnf' `d1' = `g1'
matrix `g' = `d1'
end
******* (3) MYPOIS1r OUTPUT: ROBUST SE'S FOR POISSON ANAYLTICAL FIRST
ml model d1debug mypois1r (numbids = leglrest realrest finrest whtknght /*
*/ bidprem insthold size sizesq regulatn), robust
ml maximize
******* (4) MYPOIS2 PROGRAM: POISSON WITH ANAYLTICAL SECOND DERIVATIVES
program define mypois2
version 6.0
args todo b lnf g negH /* Need to use the names todo b and lnf
todo always contains 1 and may be ignored
b is parameters, lnf is log-density
g is gradient, negH is negative hessian */
tempvar theta1 /* create as needed to calculate lf, g, ... */
mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
local y "$ML_y1" /* create to make program more readable */
tempvar lnyfact mu
quietly gen double `lnyfact' = lnfact(`y')
quietly gen double `mu' = exp(`theta1')
mlsum `lnf' = -`mu' + `y'*ln(`mu') - `lnyfact'
/* Following code is extra for analytical first derivatives */
if `todo'==0 | `lnf'==. {exit}
tempname d1
mlvecsum `lnf' `d1' = `y' - `mu'
matrix `g' = `d1'
/* Following code is extra for analytical second derivatives */
if `todo'==0 | `lnf'==. {exit}
tempname d11
mlmatsum `lnf' `d11' = `mu'
matrix `negH' = `d11'
end
******* (4) MYPOIS2 OUTPUT: POISSON WITH ANALYTICAL SECOND DERIVATIVES
ml model d2debug mypois2 (numbids = leglrest realrest finrest whtknght /*
*/ bidprem insthold size sizesq regulatn)
ml maximize
******* (5) MYPOIS2r PROGRAM: ROBUST SE'S FOR POISSON ANAYLTICAL SECOND
program define mypois2r
version 6.0
args todo b lnf g negH g1 /* Need to use the names todo b and lnf
todo always contains 1 and may be ignored
b is parameters, lnf is log-density
g is gradient negH need not be provided here
g1 is to be used for robust se's */
tempvar theta1 /* create as needed to calculate lf, g, ... */
mleval `theta1' = `b', eq(1) /* theta1 is theta1_i = x_i'b */
local y "$ML_y1" /* create to make program more readable */
tempvar lnyfact mu
quietly gen double `lnyfact' = lnfact(`y')
quietly gen double `mu' = exp(`theta1')
mlsum `lnf' = -`mu' + `y'*ln(`mu') - `lnyfact'
/* Following code is extra for analytical first derivatives
and also changed due to robust se's */
if `todo'==0 | `lnf'==. {exit}
tempname d1
quietly replace `g1' = `y' - `mu'
mlvecsum `lnf' `d1' = `g1'
matrix `g' = `d1'
/* Following code is extra for analytical second derivatives */
if `todo'==0 | `lnf'==. {exit}
tempname d11
mlmatsum `lnf' `d11' = `mu'
matrix `negH' = `d11'
end
******* (3) MYPOIS2r OUTPUT: ROBUST SE'S FOR POISSON ANAYLTICAL SECOND
ml model d2debug mypois2r (numbids = leglrest realrest finrest whtknght /*
*/ bidprem insthold size sizesq regulatn), robust
ml maximize
******* EXTRAS
* For models with two indexes see Weibull code in Stata 6 manual
********** CLOSE OUTPUT
log close