STATA: Time series data

A. Colin Cameron, Dept. of Economics, Univ. of Calif. - Davis

LAGS AND CHANGES IN STATA

Suppose we have annual data on variable GDP and we want to compute lagged GDP, the annual change in GDP and the annual percentage change in GDP.
One way to compute these is to note that _n denotes the current observation number, so _n-1 denotes the previous observation number.
This approach assumes the data are sorted by year with no missing years.
Then
   generate GDPlag = GDP[_n-1]                           constructs the lagged value of GDP, i.e. the value last year
   generate GDPchange = GDP[_n] - GDP[_n-1]   constructs the change in GDP
   generate GDPgrowth = 100*(GDP[_n] - GDP[_n-1]) / GDP[_n-1]  constructs the annual percentage change in GDP
Note the formatting: we use square brackets [ ], and _n is an underscore followed by the letter n.
Also note that the first observation for GDPlag, GDPchange and GDPgrowth will be missing since there is no observation zero.
For quarterly data, the year-on-year percentage change compares each quarter with the same quarter one year (four observations) earlier, so we give command
   generate GDPgrowth = 100*(GDP[_n] - GDP[_n-4]) / GDP[_n-4]  
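The _n approach above can be tried on a small example. The following is a minimal sketch, not part of the original notes: the three GDP values are invented for illustration, and the observations are deliberately entered out of order to show why sorting by year comes first.

```stata
* Toy data; the GDP values are invented for illustration
clear
input year GDP
1987 121
1985 100
1986 110
end

* _n refers to the current sort order, so sort by year first
sort year

generate GDPlag    = GDP[_n-1]
generate GDPchange = GDP[_n] - GDP[_n-1]
generate GDPgrowth = 100*(GDP[_n] - GDP[_n-1]) / GDP[_n-1]

list year GDP GDPlag GDPchange GDPgrowth
```

With these numbers the 1985 row is missing for all three new variables, and GDPgrowth is 10 percent in both 1986 and 1987.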

LAGS AND CHANGES IN STATA FOLLOWING TSSET

Suppose the dataset has a variable year that takes numeric values, say, 1985, 1986, 1987, ....
Then we can use the command tsset to declare year as the time variable, after which Stata's time series operators and commands can be used.
Then
  tsset year                                                           sets year as the time variable
  generate GDPlag = l.GDP                               constructs the lagged value of GDP, i.e. the value last year
  generate GDPchange = GDP - l.GDP               constructs the change in GDP
  generate GDPgrowth = 100*(GDP - l.GDP) / l.GDP    constructs the annual percentage change in GDP
Note that here l. is the lowercase letter "el" and stands for lag; the uppercase L. works equally well.
Instead of GDP - l.GDP we could use d.GDP, where the letter d stands for difference.
For quarterly data (with tsset applied to a quarterly time variable), the year-on-year percentage change uses the fourth lag, so we give command
   generate GDPgrowth = 100*(GDP - l4.GDP) / l4.GDP
A time series graph of GDP can be produced using the command
  tsline GDP
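One practical difference between the two approaches is how they treat gaps in the data. The following is a minimal sketch with invented GDP values and the year 1987 deliberately left out; the variable names GDPlag_n and GDPlag_ts are chosen here purely for illustration.

```stata
* Toy data with 1987 deliberately missing; values are invented
clear
input year GDP
1985 100
1986 110
1988 130
end

tsset year

* Lag by observation number versus lag by the time variable
generate GDPlag_n  = GDP[_n-1]
generate GDPlag_ts = l.GDP
generate GDPchange = d.GDP

list year GDP GDPlag_n GDPlag_ts GDPchange
```

For 1988, GDPlag_n picks up the 1986 value (110) because that is the previous row, whereas GDPlag_ts and GDPchange are missing because there is no observation for 1987. The time series operators are therefore the safer choice when years may be missing.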

CONVERTING STRING DATES TO A NUMERIC DATE - DIFFICULT

Dates are often given in data sets as string variables, e.g. "February 1, 1960" or "2/1/1960".
In order to use tsset and the Stata time series commands, the string needs to be converted to a number that Stata understands.
Then, for nice output on graphs, this number in turn needs to be given a date format.

As an example, suppose we have a string variable named date with values such as "2/1/1960".
(1) Convert to a number using the date( ) function
   generate date2 = date(date, "MDY")      here MDY because the date string is ordered month, day, year
This yields the number of days since January 1, 1960, e.g. 2/1/1960 yields 31.
Note that date appears twice: the first is the date( ) function and the second is our variable, which happens to be called date.
(2) Since we have monthly data, convert this to the number of months since January 1960, e.g. February 1960 yields 1.
   generate date3 = mofd(date2)
(3) date3 can be used immediately in a tsset command, but for proper dates to appear on graphs we should give a date format.
Here date3 is months since 1960 so we use the %tm format for monthly data
    format %tm date3
(4) Now give command tsset date3 and continue with the time series operators and commands described above.
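Steps (1) - (4) can be run end to end on a small example. The following is a minimal sketch: the three monthly observations and the GDP values are invented for illustration, while the variable names date, date2 and date3 follow the steps above.

```stata
* Toy monthly data; the GDP values are invented for illustration
clear
input str10 date GDP
"2/1/1960" 100
"3/1/1960" 101
"4/1/1960" 103
end

* (1) string -> days since January 1, 1960
generate date2 = date(date, "MDY")

* (2) days -> months since January 1960
generate date3 = mofd(date2)

* (3) display date3 as a monthly date
format %tm date3

* (4) declare the time variable and use time series operators
tsset date3
generate GDPchange = d.GDP

list date date2 date3 GDP GDPchange
```

In the first row date2 is 31 and date3 is 1, which the %tm format displays as 1960m2.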

Note that the particulars of steps (1) - (3) will change according to whether your data are daily, weekly, monthly, quarterly, yearly, ... and according to exactly how the dates appear in the original data, e.g. "February 1, 1960" or "2/1/1960".
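To see how these particulars vary, the date functions can be tried directly with the display command before applying them to a variable. The following is a small sketch using string literals rather than a dataset; the quarterly and daily lines are illustrations of the same pattern as steps (1) - (3), not commands from the notes above.

```stata
* The same date written three ways; each returns 31 (February 1, 1960)
display date("2/1/1960", "MDY")
display date("February 1, 1960", "MDY")
display date("1/2/1960", "DMY")

* Daily data: keep the daily number and show it with the %td format
display %td date("2/1/1960", "MDY")

* Quarterly data: convert with qofd( ) and show it with the %tq format
display %tq qofd(date("2/1/1960", "MDY"))
```

For a variable, the same functions go inside generate, followed by format with %td or %tq and then tsset, in the same way that mofd( ) and %tm were used for monthly data.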
The Stata video https://www.youtube.com/watch?v=SOQvXICIRNY is very useful.
For details see the Stata PDF documentation on Date and Time Functions, which you can reach by giving command help date and following the link.

For further information on how to use Stata go to
   http://www.econ.ucdavis.edu/faculty/cameron