Based on (Econ/ARE 240F)

**SLIDES: MACHINE LEARNING FOR MICROECONOMETRICS**

**trmachinelearningseminar.pdf**

Based on class notes for ECN 240F in Spring 2016, in turn based
on the two statistical learning books by Hastie, Tibshirani and
coauthors.

These were then presented over two seminars at the University of
Sydney in April 2017.

**Abstract:** These slides attempt to explain machine
learning to empirical economists familiar with regression
methods. The slides cover standard machine learning methods for
prediction such as k-fold cross-validation, lasso, regression
trees and random forests. The slides conclude with some recent
econometrics research that incorporates machine learning methods
in causal models estimated using observational data,
specifically (1) IV with many instruments, (2) OLS in the
partial linear model with many controls, and (3) ATE in a
heterogeneous effects model with many controls.
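As a rough illustration of two of the prediction methods named in the abstract, the sketch below picks a lasso penalty by k-fold cross-validation. It is not taken from the slides: the simulated data, the penalty grid, and the helper names `lasso_cd` and `kfold_mse` are assumptions made for this example, and the lasso is fitted by a plain coordinate-descent loop with no intercept (the simulated data have mean zero by construction).

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*sum((y - Xb)^2) + lam*sum(|b_j|).

    No intercept is fitted, so the data are assumed to have mean zero.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residuals y - Xb, starting from b = 0
    col_ss = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (b_j added back)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b

def kfold_mse(X, y, lam, k=5, seed=0):
    """Average squared out-of-fold prediction error for one penalty value."""
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    sse = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        b = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        for i in fold:
            pred = sum(x * c for x, c in zip(X[i], b))
            sse += (y[i] - pred) ** 2
    return sse / n

# Simulated data (an assumption for illustration): only the first two
# of six regressors matter.
rng = random.Random(42)
n, p = 120, 6
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]

grid = [0.01, 0.05, 0.1, 0.5, 2.0]  # hypothetical penalty grid
cv = {lam: kfold_mse(X, y, lam) for lam in grid}
best_lam = min(cv, key=cv.get)
coef = lasso_cd(X, y, best_lam)
print("CV MSE by penalty:", cv)
print("chosen penalty:", best_lam, "coefficients:", coef)
```

The same folds are reused for every penalty value (fixed `seed`), so differences in cross-validated error across the grid reflect the penalty rather than the random split.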

For statistical learning, the main text used in 240F is an
undergraduate / master's level book:

**ISL:** Gareth James, Daniela Witten, Trevor Hastie and
Robert Tibshirani (2013), **An Introduction to Statistical
Learning: with Applications in R**, Springer.

A free legal pdf is at http://www-bcf.usc.edu/~gareth/ISL/
and a $25 hardcopy can be obtained via
http://www.springer.com/gp/products/books/mycopy

Supplementary material on statistical learning came from the
Ph.D. level book

**ESL:** Trevor Hastie, Robert Tibshirani and Jerome
Friedman (2009), **The Elements of Statistical Learning: Data
Mining, Inference and Prediction**, Springer.

A free legal pdf is at
http://statweb.stanford.edu/~tibs/ElemStatLearn/index.html
and a $25 hardcopy can be obtained via
http://www.springer.com/gp/products/books/mycopy

Bradley Efron and Trevor Hastie (2016)

Victor Chernozhukov http://web.mit.edu/~vchern/www/

Alex Belloni https://faculty.fuqua.duke.edu/~abn5/belloni-index.html

Christian Hansen http://faculty.chicagobooth.edu/christian.hansen/research/

Susan Athey https://www.gsb.stanford.edu/faculty-research/faculty/susan-athey https://people.stanford.edu/athey/research

Guido Imbens https://www.gsb.stanford.edu/faculty-research/faculty/guido-w-imbens https://people.stanford.edu/imbens/publications

ONLINE COURSES

This is a very active area: All the papers below were published in 2012 or later.

Partial survey focused on using lasso: A. Belloni, V. Chernozhukov and C. Hansen, "High-Dimensional Methods and Inference on Treatment and Structural Effects in Economics," J. Economic Perspectives, Spring 2014, pp. 29-50, with Stata and Matlab programs and Stata replication code available.

Lasso and IV: A. Belloni, V. Chernozhukov, D. Chen, and C. Hansen, "Sparse Models and Methods for Instrumental Regression, with an Application to Eminent Domain," arXiv 2010, Econometrica 2012, pp. 2369-2429.

Lasso and control function: A. Belloni, V. Chernozhukov and C. Hansen, "Inference on Treatment Effects After Selection Among High-Dimensional Controls," The Review of Economic Studies 2014, pp. 608-650.

Lasso and propensity score weighting: M. Farrell, "Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations," Journal of Econometrics, 2015, vol. 189, pp. 1-23.

H. Varian, "Big Data: New Tricks for Econometrics," J. Economic Perspectives, Spring 2014, pp. 3-28.

Dataset can be obtained from https://www.aeaweb.org/articles.php?doi=10.1257/jep.28.2

Other papers by Chernozhukov and coauthors on this topic are at http://www.mit.edu/~vchern/#veryhigh

G. Imbens and S. Athey, "Machine Learning Methods for Estimating Heterogeneous Causal Effects."

Brief overview paper by S. Athey "Machine Learning and Causal Inference for Policy Evaluation" http://faculty-gsb.stanford.edu/athey/documents/AtheyKDDfinal.pdf

Other papers by Athey are at http://faculty-gsb.stanford.edu/athey/research.html#Econometric_Theory_%28Identification_and_E
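Item (2) of the abstract, OLS in the partial linear model with many controls, is the setting of the Belloni, Chernozhukov and Hansen (2014) Review of Economic Studies paper listed above. A minimal sketch of the double-selection idea follows: run a lasso of the outcome on the controls, run a lasso of the treatment on the controls, then run OLS of the outcome on the treatment plus the union of selected controls. Everything specific here is an assumption for illustration: the simulated data, the fixed penalty `lam = 0.2` (the paper uses a data-driven penalty), and the helper names `lasso_cd` and `ols`. No intercept is fitted because the simulated variables have mean zero, and standard errors are omitted.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*sum((y - Xb)^2) + lam*sum(|b_j|), no intercept."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)
    col_ss = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b

def ols(Z, y):
    """OLS coefficients from the normal equations Z'Z b = Z'y (Gauss-Jordan)."""
    k = len(Z[0])
    A = [[sum(r[a] * r[c] for r in Z) for c in range(k)]
         + [sum(r[a] * yi for r, yi in zip(Z, y))] for a in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (an assumption): control x0 drives both treatment d and
# outcome y, so omitting it biases the coefficient on d. True effect is 1.0.
rng = random.Random(7)
n, p = 200, 8
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
d = [row[0] + 0.5 * row[1] + rng.gauss(0.0, 1.0) for row in X]
y = [1.0 * di + 2.0 * row[0] + rng.gauss(0.0, 1.0) for di, row in zip(d, X)]

lam = 0.2  # hypothetical fixed penalty, not the paper's data-driven choice
sel_y = {j for j, c in enumerate(lasso_cd(X, y, lam)) if c != 0.0}  # step 1
sel_d = {j for j, c in enumerate(lasso_cd(X, d, lam)) if c != 0.0}  # step 2
selected = sorted(sel_y | sel_d)

# Step 3: OLS of y on d and the union of selected controls
Z = [[di] + [row[j] for j in selected] for di, row in zip(d, X)]
alpha_hat = ols(Z, y)[0]
alpha_naive = ols([[di] for di in d], y)[0]  # no controls, for comparison
print("selected controls:", selected)
print("double-selection estimate:", alpha_hat, "naive estimate:", alpha_naive)
```

Selecting controls from both equations is what guards against dropping a variable that predicts the treatment strongly but the outcome only weakly; selecting on the outcome equation alone can miss such a variable and leave omitted-variable bias in the final OLS step.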