[R] How to make a lagged variable in panel data?
Ila Patnaik
ila at mayin.org
Sat Aug 13 14:31:23 CEST 2005
Suppose we observe N individuals, and for each of them we have a
time series. How do we correctly create a lagged value of the
time-series variable within each individual?
As an example, suppose I create:
A <- data.frame(year   = rep(1980:1984, 3),
                person = factor(rep(1:3, each = 5)),
                wage   = rnorm(15))
> A
   year person        wage
1  1980      1  0.17923212
2  1981      1  0.25610292
3  1982      1  0.50833655
4  1983      1 -0.42448395
5  1984      1  0.49233532
6  1980      2 -0.49928025
7  1981      2  0.06842660
8  1982      2  0.65677575
9  1983      2  0.15947390
10 1984      2 -0.46585116
11 1980      3 -0.29052635
12 1981      3 -0.27109203
13 1982      3 -0.76168164
14 1983      3  0.02294361
15 1984      3  2.22828032
What I'd like to do is to make a lagged wage for each person, i.e., I
should get an additional variable A$wage.lag1:
> A
   year person        wage   wage.lag1
1  1980      1  0.17923212          NA
2  1981      1  0.25610292  0.17923212
3  1982      1  0.50833655  0.25610292
4  1983      1 -0.42448395  0.50833655
5  1984      1  0.49233532 -0.42448395
6  1980      2 -0.49928025          NA
7  1981      2  0.06842660 -0.49928025
8  1982      2  0.65677575  0.06842660
9  1983      2  0.15947390  0.65677575
10 1984      2 -0.46585116  0.15947390
11 1980      3 -0.29052635          NA
12 1981      3 -0.27109203 -0.29052635
13 1982      3 -0.76168164 -0.27109203
14 1983      3  0.02294361 -0.76168164
15 1984      3  2.22828032  0.02294361
I could write code that does this "by hand", but lagging within each
individual strikes me as a fundamental requirement when dealing with
panel data, so perhaps there is high-level support for such a task?
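For concreteness, the "by hand" version I have in mind would be roughly
the sketch below. It uses only base R (order() and ave()). The helper
name lag1 is my own invention, not an existing function, and the sketch
assumes that every person has at least one row and that there are no
gaps in the years, so that "previous row" really means "previous year".

```r
# Example panel: 3 persons observed over 1980-1984 (same layout as A above).
A <- data.frame(year   = rep(1980:1984, 3),
                person = factor(rep(1:3, each = 5)),
                wage   = rnorm(15))

# Sort by person, then year, so the previous row is the previous year.
A <- A[order(A$person, A$year), ]

# lag1() is a hypothetical helper: shift a vector down by one position,
# putting NA in the first slot and dropping the last value.
lag1 <- function(x) c(NA, x[-length(x)])

# ave() applies lag1() separately within each person and returns the
# results in the original row order.
A$wage.lag1 <- ave(A$wage, A$person, FUN = lag1)
```

This gives NA for the first year of each person, which is what I want,
but it silently produces wrong lags if a year is missing for someone.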
I have been trying to learn groupedData objects and the tools that go
with them, but I have not found any hint of how to address such a
task with them.
-Ila