[R] unequal number of observations for longitudinal data
Chuck Cleland
ccleland at optonline.net
Sat Jan 27 11:58:00 CET 2007
gallon li wrote:
> I have a large longitudinal data set. The number of observations for each
> subject is not the same across the sample. The largest number of
> observations for a subject is 5 and the smallest is 1.
>
> Now I want each subject to have the same number of observations by
> filling with zeros, e.g., my original sample is
>
> id x
> 001 10
> 001 30
> 001 20
> 002 10
> 002 20
> 002 40
> 002 80
> 002 70
> 003 20
> 003 40
> 004 ......
>
> Now I wish to make the data look like
>
> id x
> 001 10
> 001 30
> 001 20
> 001 0
> 001 0
> 002 10
> 002 20
> 002 40
> 002 80
> 002 70
> 003 20
> 003 40
> 003 0
> 003 0
> 003 0
> 004 ......
>
> so that each id has exactly 5 observations. Is there a function that
> would let me do this quickly?
Filling in with zeros seems like a bad idea, but here is an approach
to filling in with NAs. I will leave replacing the NAs with zeros to you.
df.long <- data.frame(id = c(1,1,1,2,2,2,2,2,3,3), x = runif(10),
time = c(1,2,5,1,2,3,4,5,2,4))
df.long
   id          x time
1   1 0.72888215    1
2   1 0.60893548    2
3   1 0.41347690    5
4   2 0.79388248    1
5   2 0.05810054    2
6   2 0.02451654    3
7   2 0.85464775    4
8   2 0.15970365    5
9   3 0.22856183    2
10  3 0.38291471    4
df.wide <- reshape(df.long, idvar = "id", timevar = "time", v.names = "x",
direction = "wide")
df.wide
  id       x.1       x.2       x.5       x.3       x.4
1  1 0.6375135 0.1651258 0.3210223        NA        NA
4  2 0.9878134 0.8909020 0.9853269 0.7747615 0.3834130
9  3        NA 0.3586109        NA        NA 0.8310539
df.long2 <- reshape(df.wide, direction="long")
df.long2
    id time         x
1.1  1    1 0.6375135
2.1  2    1 0.9878134
3.1  3    1        NA
1.2  1    2 0.1651258
2.2  2    2 0.8909020
3.2  3    2 0.3586109
1.5  1    5 0.3210223
2.5  2    5 0.9853269
3.5  3    5        NA
1.3  1    3        NA
2.3  2    3 0.7747615
3.3  3    3        NA
1.4  1    4        NA
2.4  2    4 0.3834130
3.4  3    4 0.8310539
This assumes that your data in the "long" format has a time variable.
See the help page for reshape() for more details.
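
If the data has no time variable, as in the example in the question, one possible way to apply the same idea is sketched below. It is only an illustration under stated assumptions: rows are taken to be already in observation order within each id, and the names df, wide and padded are made up for this example. A within-id counter built with ave() stands in for the time variable, and the NAs are replaced with zeros at the end.

```r
# Example data in the layout from the question (no time variable).
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
                 x  = c(10, 30, 20, 10, 20, 40, 80, 70, 20, 40))

# Assumption: rows are already in observation order within each id,
# so a running counter within id can serve as the time variable.
df$time <- ave(seq_along(df$id), df$id, FUN = seq_along)

# Long -> wide -> long; id/time combinations with no data become NA.
wide <- reshape(df, idvar = "id", timevar = "time", v.names = "x",
                direction = "wide")
padded <- reshape(wide, direction = "long")

# Replace the NAs with zeros, then sort by id and time.
padded$x[is.na(padded$x)] <- 0
padded <- padded[order(padded$id, padded$time), ]
rownames(padded) <- NULL
padded
```

Because the counter restarts at 1 for every id, the padding rows always land at the end of each subject's block. Whether zeros are a sensible stand-in for missing observations is a separate question.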
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894