[R] reshape panel data
Richard Saba
sabaric at auburn.edu
Thu Apr 8 18:55:19 CEST 2010
I have a data set with observations on 549 cities spanning an 18-year
period. However, some of the cities did not report in one or more of the 18
years. I would like to implement the procedure suggested by Wooldridge in
section 17.1.3 of his "Econometric Analysis of Cross Section and Panel Data"
to correct for attrition. For example, the table below indicates that the 3rd
and the 7th cities in the data set do not have observations for several
years. The Wooldridge procedure requires generating a selection
variable that takes the value 1 if the city reports in that year and 0
otherwise. How do I assign a zero to a city when it does not have an
observation for that year?
For example, suppose I have the following data set. The observations range
over three years, 1990-1992, but some cities did not report in some years.
The original data looks like this:
Cicoid  year  other_variables  selection_variable
1       1990  xxxxxxxxxx       1
1       1991  xxxxxxxxxx       1
2       1991  xxxxxxxxxx       1
3       1990  xxxxxxxxxx       1
3       1991  xxxxxxxxxx       1
3       1992  xxxxxxxxxx       1
I would like to get a data set that looks like this:
Cicoid  year  other_variables  selection_variable
1       1990  xxxxxxxxxx       1
1       1991  xxxxxxxxxx       1
1       1992  ..........       0
2       1990  ..........       0
2       1991  xxxxxxxxxx       1
2       1992  ..........       0
3       1990  xxxxxxxxxx       1
3       1991  xxxxxxxxxx       1
3       1992  xxxxxxxxxx       1
I can reshape the data in Stata with the following three simple commands:
xtset Cicoid year
tsfill, full
replace selection_variable=0 if selection_variable==.
That is, I declare the data to be a panel by identifying the ID and time index
variables, use the time-series fill command to add the missing city-year rows,
and then set the selection variable to 0 in the rows that were added.
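
For comparison, the same fill might be sketched in base R with expand.grid()
and merge(), roughly as below. This is only an illustrative sketch on a toy
version of the example above: the column x and its values are made up to stand
in for other_variables, and the object names (panel, full_index, balanced) are
hypothetical. It does not use zoo or plm.

```r
# Toy data mirroring the example above; x and its values are made up
# and stand in for other_variables.
panel <- data.frame(
  Cicoid = c(1, 1, 2, 3, 3, 3),
  year   = c(1990, 1991, 1991, 1990, 1991, 1992),
  x      = c(0.5, 0.7, 1.2, 2.1, 2.4, 2.6),
  selection_variable = 1
)

# Build every Cicoid-year combination over the full year range
# (the balanced index, similar in spirit to Stata's tsfill, full).
full_index <- expand.grid(
  Cicoid = sort(unique(panel$Cicoid)),
  year   = seq(min(panel$year), max(panel$year))
)

# Left join: city-years absent from the original data come back as NA.
balanced <- merge(full_index, panel, by = c("Cicoid", "year"), all.x = TRUE)

# Non-reporting city-years get selection_variable = 0.
balanced$selection_variable[is.na(balanced$selection_variable)] <- 0

# Sort by city, then year, to match the desired layout.
balanced <- balanced[order(balanced$Cicoid, balanced$year), ]
print(balanced)
```

The other variables stay NA in the added rows, which corresponds to the dotted
entries in the desired table.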
I have searched the help pages and vignettes of both the "zoo" and "plm"
packages but cannot find an equivalent in R.
Can anyone help? Thanks,
Richard Saba