[R] select observations from longitudinal data
Bill.Venables at csiro.au
Sun Mar 29 11:45:59 CEST 2009
Let's tackle the bigger problem of doing this not just for time = 3 but for all times.
First we start with your data frame:
> dat
   id time  x
1   1    1 10
2   1    2 11
3   1    3 23
4   1    4 23
5   2    2 12
6   2    3 13
7   2    4 14
8   3    1 11
9   3    3 15
10  3    4 18
11  3    5 21
12  4    2 22
13  4    3 27
14  4    6 29
>
### Now put the data into an id x time matrix, with gaps:
> mat <- with(dat, {
+ lev_id <- sort(unique(id))
+ lev_tm <- sort(unique(time))
+ M <- matrix(NA, length(lev_id), length(lev_tm))
+ dimnames(M) <- list(id = lev_id, time = lev_tm)
+ M[cbind(match(id, lev_id), match(time, lev_tm))] <- x
+ M
+ })
> mat
   time
id   1  2  3  4  5  6
  1 10 11 23 23 NA NA
  2 NA 12 13 14 NA NA
  3 11 NA 15 18 21 NA
  4 NA 22 27 NA NA 29
>
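As an aside, base R's tapply() can build the same id x time matrix in one call. This is only a sketch of an alternative, not the construction used above. The names dat0 and mat2 are chosen here for illustration, and the sketch assumes at most one observation per (id, time) pair (it keeps the first value if there are several).

```r
# Rebuild the example data from the post (dat0 is a name chosen for this sketch).
dat0 <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4),
  time = c(1, 2, 3, 4, 2, 3, 4, 1, 3, 4, 5, 2, 3, 6),
  x    = c(10, 11, 23, 23, 12, 13, 14, 11, 15, 18, 21, 22, 27, 29)
)

# Cross-classify x by id and time; (id, time) cells with no data come out as NA.
# Assumption: one observation per cell, so taking the first value loses nothing.
mat2 <- with(dat0, tapply(x, list(id = id, time = time), function(v) v[1]))
mat2
```

The named list passed as the index gives the result the same "id" and "time" dimnames labels as the hand-built matrix.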
### Now do the replacements: each NA in a row is filled from the
### row below it, i.e. from the next id at the same time
### (this is a very questionable dodge, by the way!)
> for(i in 2:nrow(mat))
+ if(any(k <- is.na(mat[i-1, ])))
+ mat[i-1, k] <- mat[i, k]
>
> mat
   time
id   1  2  3  4  5  6
  1 10 11 23 23 NA NA
  2 11 12 13 14 21 NA
  3 11 22 15 18 21 29
  4 NA 22 27 NA NA 29
>
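The loop above fills each gap from the row below, that is, from the next id at the same time. If the intent is what the original question describes (use the same id's value at the next time point), a column-wise variant is one possible sketch. The names dat0, mat0, filled and x3 are made up for this illustration, and tapply() is swapped in for the matrix-indexing construction used above.

```r
# Example data from the post (dat0 is a name chosen for this sketch).
dat0 <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4),
  time = c(1, 2, 3, 4, 2, 3, 4, 1, 3, 4, 5, 2, 3, 6),
  x    = c(10, 11, 23, 23, 12, 13, 14, 11, 15, 18, 21, 22, 27, 29)
)

# id x time matrix with NA gaps (assumes one observation per cell).
mat0 <- with(dat0, tapply(x, list(id = id, time = time), function(v) v[1]))

# Fill each gap from the NEXT TIME of the SAME id.
# Reading from the untouched mat0 keeps the fill to a single step:
# a gap is only ever replaced by an originally observed value.
filled <- mat0
for (j in seq_len(ncol(mat0) - 1)) {
  k <- is.na(mat0[, j])
  filled[k, j] <- mat0[k, j + 1]
}

# The original request only: x at time 3, else x at time 4, else NA.
x3 <- ifelse(is.na(mat0[, "3"]), mat0[, "4"], mat0[, "3"])
x3
```

Gaps whose next time point is also unobserved stay NA, and the last time column is never filled because there is nothing after it.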
### some gaps cannot be filled.
### now turn it back into a data frame;
### this is a splendid trick that no one knows about:
> dat <- as.data.frame(as.table(mat), responseName = "x")
> dat <- with(dat, dat[order(id, time), ])
>
### this will look OK, but the first two columns are now factors.
### If we started with numeric variables, it may be useful to
### turn them back into numeric variables:
> dat <- within(dat, {
+ id <- as.numeric(as.character(id))
+ time <- as.numeric(as.character(time))
+ })
>
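The detour through as.character() matters because as.numeric() applied directly to a factor returns the internal level codes rather than the labels. A small sketch of both points, using a made-up factor f and a made-up 2 x 2 matrix m (neither is part of the data above):

```r
# A factor whose labels look like numbers.
f <- factor(c("10", "20", "10"))

codes <- as.numeric(f)                # 1 2 1: the level codes, not the labels
vals  <- as.numeric(as.character(f))  # 10 20 10: the labels read as numbers

# The same effect after the matrix -> data frame round trip:
m <- matrix(c(10, NA, 11, 12), nrow = 2,
            dimnames = list(id = c("5", "7"), time = c("1", "2")))
long <- as.data.frame(as.table(m), responseName = "x")

is.factor(long$id)                    # TRUE: dimnames became factor levels
long$id   <- as.numeric(as.character(long$id))
long$time <- as.numeric(as.character(long$time))
long
```

With ids 5 and 7, converting the factor directly would have given 1 and 2, which is why the conversion goes via character.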
> dat
   id time  x
1   1    1 10
5   1    2 11
9   1    3 23
13  1    4 23
17  1    5 NA
21  1    6 NA
2   2    1 11
6   2    2 12
10  2    3 13
14  2    4 14
18  2    5 21
22  2    6 NA
3   3    1 11
7   3    2 22
11  3    3 15
15  3    4 18
19  3    5 21
23  3    6 29
4   4    1 NA
8   4    2 22
12  4    3 27
16  4    4 NA
20  4    5 NA
24  4    6 29
>
### As many gaps have been filled as can be filled (with fake data!).
### If you want to remove those still missing, you can use
> dat <- na.omit(dat)
Bill Venables
http://www.cmis.csiro.au/bill.venables/
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of gallon li
Sent: Sunday, 29 March 2009 6:51 PM
To: r-help
Subject: [R] select observations from longitudinal data
Suppose I have longitudinal data in long format:
id time x
1 1 10
1 2 11
1 3 23
1 4 23
2 2 12
2 3 13
2 4 14
3 1 11
3 3 15
3 4 18
3 5 21
4 2 22
4 3 27
4 6 29
I want to select the x value for each ID when time is equal to 3. When that
observation is missing, I want to replace it with the observation
at time equal to 4; if that is also missing, just use NA.
How can I implement this with a quick command?