(Econ/ARE 240F)

Department of Economics, University of California - Davis

Professor Colin Cameron   Bainer 1132

Tuesday Thursday 8.00 - 9.50 am  SSH 1113  (Blue Conference Room)

Office Hours:
Wednesday 2.00 - 4.00 pm   and  Thursday 10.00 am - 11.00 noon 

Teaching Assistant:
Jongkwan Lee 
Office hours: Tuesday 2-3 pm and Wednesday 2-3 pm

Pre-requisites: The listed pre-requisite is Econ / ARE 240D.  The essential pre-requisite is Econ / ARE 240D.

Course Goals: The Spring 2016 course begins with a survey of statistical learning methods, covering many of them only briefly. The most important for econometrics include cross-validation, Lasso and regression trees. Following the survey, the course will consider their use in causal econometrics research. The remainder of the course covers various topics.

Brief Course Outline:
Classes 1-10      Statistical learning
Classes 11-12    Statistical learning for causal econometrics
Classes 13-16    Bayesian analysis and multiple imputation
Classes 17-20    Further topics (most likely including clustering) 

For statistical learning the main text is an undergraduate level book
ISL: Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013), An Introduction to Statistical Learning: with Applications in R, Springer.
A free legal pdf is at and a $25 hardcopy can be obtained via

Supplementary material on statistical learning will come from the graduate level book
ESL: Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer.
A free legal pdf is at and a $25 hardcopy can be obtained via

Detailed Course Outline: 

Classes 1-2  Introduction to Statistical Learning
Statistical learning overview (ISL chapters 1, 2.1-2.2)
Getting started in R  (ISL chapter 2.3 and
Linear Regression (ISL chapter 3)
Cross-validation (ISL chapter 5)

Classes 3-5  Linear Model Selection and Regularization
Subset selection (ISL chapter 6.1)
Ridge Regression, Lasso, and LARS (ISL chapter 6.2  and  ESL pp.73-79, 86-93)
Principal Components and Partial Least Squares (ISL chapter 6.3)
High-dimensional Data (ISL chapter 6.4)

Classes 6-7  Flexible Regression Models
Polynomials, Step Functions and Basis Functions (ISL chapter 7.1-7.3)
Splines (ISL chapter 7.4-7.5)
Local Regression (ISL chapter 7.6)
Generalized Additive Models (ISL chapter 7.7)
Regression Trees (ISL chapter 8.1)
Bagging, Random Forests, Boosting (ISL chapter 8.2)

Classes 8-9  Classification and Unsupervised Learning
Logistic Regression and Discriminant Analysis  (ISL chapter 4.1-4.5)
Support Vector Machines (ISL chapter 9.1-9.3)
Unsupervised Learning (ISL chapter 10.1-10.3)

Class 10  Midterm exam

Classes 11-12 Statistical Learning in Econometrics
Double Selection Lasso  Belloni, Chernozhukov and Hansen (JEP Spring 2014 pp.29-50)
Big Data for Econometrics  Varian (JEP Spring 2014 pp.3-28)
Recursive tree partitioning  Athey and Imbens (2015)

Classes 13-17  Bayesian Methods and Multiple Imputation
Bayesian Methods  Cameron and Trivedi (2005), Microeconometrics: Methods and Applications, Chapter 13.1-13.6  plus notes provided
Multiple Imputation   Cameron and Trivedi (2005), Microeconometrics: Methods and Applications, Chapter 13.7, 27.7-27.9

Classes 18-20  Cluster-robust Inference for Regression

Other Material:
Assignments, data, etc. will be posted on the course website on Smartsite under Resources.

Computer Materials:
which is available free. See
Assignments will use STATA.  

Course Grading:
Assignments 50%  Due Thursdays April 7, 21; May 3, 17; June 2.
Midterm 25%     Thursday April 28
Final 25%           Wednesday June 8  10.30am-12.30pm  Material after midterm.

Assignments must be handed in on time, so solutions can be discussed in class and distributed in a timely manner.
No credit for late assignments. All must be done.
Academic integrity is required. What is academic integrity? See the UCD Student Judicial Affairs website
As an exception to their rules, I permit some collaboration with other students in doing assignments, but the work handed in must be your own. Each person must create their own Stata output and write up their own answers. And you are to write on your assignment the name of the person(s) you worked with.
Exams will be closed book. The final exam is comprehensive.