STATA: Time series data
A. Colin Cameron, Dept. of Economics, Univ. of Calif. - Davis
LAGS AND CHANGES IN STATA
Suppose we have annual data on variable GDP and we want to
compute lagged GDP, the annual change in GDP and the annual
percentage change in GDP.
One way to compute these is to note that _n denotes the current
observation number, so _n-1 denotes the previous observation number.
Then
generate GDPlag = GDP[_n-1]
constructs the lagged value of GDP, i.e. the value last year
generate GDPchange = GDP[_n] - GDP[_n-1]
constructs the change in GDP
generate GDPgrowth = 100*(GDP[_n] - GDP[_n-1]) / GDP[_n-1]
constructs the annual percentage change in GDP
Note the syntax - the subscript goes in square brackets [ ] and _n is
an underscore followed by the letter n.
Also note that the first observation for GDPlag, GDPchange and
GDPgrowth will be missing since there is no observation zero.
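As a minimal sketch, the commands above can be tried on a small made-up dataset. The years and GDP values below are invented for illustration. Note that [_n-1] refers to the previous row of the dataset, so this assumes the data are sorted by year and that no years are missing.

```stata
* Hypothetical annual data: the values are invented for illustration
clear
input year GDP
2001 100
2002 110
2003 121
end

* [_n-1] means "the previous row", so put the rows in time order first
sort year

generate GDPlag = GDP[_n-1]
generate GDPchange = GDP[_n] - GDP[_n-1]
generate GDPgrowth = 100*(GDP[_n] - GDP[_n-1]) / GDP[_n-1]

list
```

Here GDPgrowth is 10 in both 2002 and 2003, and all three new variables are missing in 2001.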
For quarterly data, to obtain the year-on-year percentage change,
for example, we give the command
generate GDPgrowth = 100*(GDP[_n] - GDP[_n-4]) / GDP[_n-4]
LAGS AND CHANGES IN STATA FOLLOWING TSSET
Suppose the dataset has a variable year that takes numeric
values, say, 1985, 1986, 1987, ....
Then we can use the command tsset to set year as the time variable,
and then use Stata's time series operators and commands.
Then
tsset year
sets year as the time variable
generate GDPlag = l.GDP
constructs the lagged value of GDP, i.e. the value last year
generate GDPchange = GDP - l.GDP
constructs the change in GDP
generate GDPgrowth = 100*(GDP - l.GDP) / l.GDP
constructs the annual percentage change in GDP
Note that here l. is the letter "el" and stands for lag.
Instead of GDP - l.GDP we could use d.GDP where the letter d stands
for difference.
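The following sketch illustrates the l. and d. operators on made-up data in which one year (2003) is deliberately absent. The values and the variable name GDPlag_n are invented for illustration. After tsset, l.GDP refers to the value one year earlier rather than to the previous row, so it is missing when that year is not in the data.

```stata
* Hypothetical annual data with no observation for 2003
clear
input year GDP
2001 100
2002 110
2004 121
end

tsset year

* Lag by time: missing in 2001 and also in 2004, since 2003 is absent
generate GDPlag = l.GDP

* Lag by row, for comparison: the 2004 row picks up the 2002 value
generate GDPlag_n = GDP[_n-1]

* d.GDP is the same as GDP - l.GDP
generate GDPchange = d.GDP

list
```

With complete data and sorted rows the two approaches give the same answers. They differ only when there are gaps in the time variable.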
For quarterly data, to obtain the year-on-year percentage change,
for example, we give the command
generate GDPgrowth = 100*(GDP - l4.GDP) / l4.GDP
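For l4. to mean "four quarters ago" the time variable must be a quarterly date. The sketch below assumes the data hold separate numeric variables year and quarter, which is an assumption and not something stated above, and builds a quarterly date with the yq( ) function. The variable name qdate and the GDP values are invented for illustration.

```stata
* Hypothetical quarterly data: the values are invented for illustration
clear
input year quarter GDP
2001 1 100
2001 2 120
2001 3 140
2001 4 160
2002 1 110
2002 2 132
2002 3 154
2002 4 176
end

* yq() returns the number of quarters since 1960q1
generate qdate = yq(year, quarter)
format %tq qdate
tsset qdate

* Year-on-year percentage change: compare with the same quarter last year
generate GDPgrowth = 100*(GDP - l4.GDP) / l4.GDP

list
```

GDPgrowth is missing for the four quarters of 2001 and equals 10 for each quarter of 2002.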
A time series graph of GDP can be produced using the command
tsline GDP
CONVERTING STRING DATES TO A NUMERIC DATE - DIFFICULT
Dates are often given in data sets as string variables
e.g. "February 1, 1960" or "2/1/1960"
In order to use tsset and the Stata time series commands, the string
needs to be converted to a number that Stata understands.
Then, to have nice output for graphs, this number in turn needs to
be given a date format.
As an example, suppose we have a string variable named date
with values formatted as e.g. "2/1/1960"
(1) Convert to a number using the date( ) function
generate date2 = date(date, "MDY")
Here we use MDY because the date string is
ordered month, day, year.
This yields the number of days since 1/1/1960,
e.g. 2/1/1960 yields 31.
Note that date appears twice - the first is the date( ) function and
the second is our variable, which happens to be called date.
(2) Since we have monthly data, convert this to the number of months
since January 1960.
generate date3 = mofd(date2)
(3) date3 can be used immediately in a tsset command, but for proper
dates to appear on graphs we should give a date format.
Here date3 is months since 1960 so we use the %tm format for monthly
data
format %tm date3
(4) Now give the command tsset date3 and then use the time series commands as before.
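Steps (1) to (4) can be put together as in the following sketch. The three date strings are made up for illustration. Everything else follows the commands given above.

```stata
* Hypothetical monthly data with the date held as a string
clear
input str10 date
"1/1/1960"
"2/1/1960"
"3/1/1960"
end

* (1) String to number of days since 1/1/1960
generate date2 = date(date, "MDY")

* (2) Days to number of months since January 1960
generate date3 = mofd(date2)

* (3) Display date3 as a monthly date, e.g. 1960m2
format %tm date3

* (4) Declare date3 as the time variable
tsset date3

list
```

For these three rows date2 is 0, 31 and 60 (1960 is a leap year) and date3 is 0, 1 and 2.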
Note that the particulars for steps (1) - (3) will change according
to whether your data are daily, weekly, monthly, quarterly, yearly,
and so on, and according to the exact way the dates appear in the
original data, e.g. "February 1, 1960" or "2/1/1960".
The Stata video https://www.youtube.com/watch?v=SOQvXICIRNY is very
useful.
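As one illustration of how steps (1) - (3) vary, the sketch below assumes the dates are written with the month spelled out, and that we want a daily date and then a quarterly date rather than a monthly one. The variable names datestr, ddate and qdate and the two date strings are invented for illustration.

```stata
* Hypothetical data with the month written out in words
clear
input str20 datestr
"February 1, 1960"
"May 1, 1960"
end

* The month, day, year ordering is still described by "MDY"
generate ddate = date(datestr, "MDY")

* For daily data, keep the day count and use the %td format
format %td ddate

* For quarterly data, convert days to quarters since 1960q1 and use %tq
generate qdate = qofd(ddate)
format %tq qdate

list
```

The pattern is the same as before: a function converts the string to a number, and a %t format matching the frequency makes that number display as a date.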
For details see the Stata PDF documentation on Date and Time
Functions, which you can reach from the output of the command help date.
For further information on how to use Stata go to
http://www.econ.ucdavis.edu/faculty/cameron