[R] Creating missingness in repeated measurement data

David Winsemius dwinsemius at comcast.net
Mon Sep 17 21:31:03 CEST 2012


On Sep 17, 2012, at 11:32 AM, john james wrote:

> Dear R users,
>  
> I have the following problem. My dataset (dat) is as follows:
> 
> a <- c(1,2,3) 
> id <- rep(a, c(3,2,3))
> stat <- c(1,1,0,1,0,1,1,1)
> g <- c(0,0,0,0,0,0,1,0)
> stop <- c(1,2,4,2,4,1,1.5,3)
> dat <- data.frame(id,stat,g,stop)
>  
> I want to create a new dataset (dat2) with missing values
> such that when either g == 1 or stat == 0, the remaining rows for an
> individual subject are set to NA, using a new variable d that records
> the exact time (from the stop variable) at which this happened.
> By this I mean a dat2 that looks like:
>  
> id <- rep(a, c(3,2,3))
> sta2 <- c(1,1,NA,1,NA,1,NA,NA)
> g2 <- c(0,0,NA,0,NA,0,NA,NA)
> stop2 <- c(1,2,NA,2,NA,1,NA,NA)
> d <- c(4,4,NA,4,NA,1.5,NA,NA)
>  
> dat2 <- data.frame(id=id, stat2=sta2, g2=g2, stop2=stop2, d=d)

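> # Running count of events (stat == 0 or g == 1) within each id;
> # nonzero from the first event row onward
> suppressidx <- ave(dat$stat==0 | dat$g==1, dat$id, FUN=cumsum)
> # Replace a column's values with NA wherever the running count is nonzero
> suppress <- function(col) { ifelse( suppressidx, NA, col)}
> # Apply to every column except id
> cbind(dat[1], sapply( dat[-1], suppress ) )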
  id stat  g stop
1  1    1  0    1
2  1    1  0    2
3  1   NA NA   NA
4  2    1  0    2
5  2   NA NA   NA
6  3    1  0    1
7  3   NA NA   NA
8  3   NA NA   NA
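
The code above does not build the d column that the question asks for. What follows is a minimal sketch of one way to add it, not a definitive answer. It assumes that d should hold each subject's stop value at the first row where stat == 0 or g == 1, that d is NA on the suppressed rows, and that rows are already ordered by time within id. The names event, suppressed and first_time are illustrative and do not come from the original thread.

```r
# Data from the question
dat <- data.frame(
  id   = rep(c(1, 2, 3), c(3, 2, 3)),
  stat = c(1, 1, 0, 1, 0, 1, 1, 1),
  g    = c(0, 0, 0, 0, 0, 0, 1, 0),
  stop = c(1, 2, 4, 2, 4, 1, 1.5, 3)
)

# TRUE on rows where the event occurs
event <- dat$stat == 0 | dat$g == 1

# TRUE from each subject's first event row onward
suppressed <- ave(as.integer(event), dat$id, FUN = cumsum) > 0

# Stop time at each subject's first event, repeated over that subject's rows;
# stays NA for a subject with no event at all
first_time <- ave(ifelse(event, dat$stop, NA), dat$id,
                  FUN = function(x) x[!is.na(x)][1])

# Assemble the result, blanking the suppressed rows in every column but id
dat2 <- data.frame(
  id    = dat$id,
  stat2 = ifelse(suppressed, NA, dat$stat),
  g2    = ifelse(suppressed, NA, dat$g),
  stop2 = ifelse(suppressed, NA, dat$stop),
  d     = ifelse(suppressed, NA, first_time)
)
dat2
```

With the example data, the d column should agree with the vector given in the question. If the rows might not be sorted by stop within id, sort them first, since cumsum depends on row order.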

-- 

David Winsemius, MD
Alameda, CA, USA



