[R] multiple imputation based on a condition
    Gary Collins 
    collins.gs at gmail.com
       
    Sat May 22 23:55:07 CEST 2010
    
    
  
I would be grateful for any suggestions on the following.
I'm trying to impute data, where a fictitious dataset is defined as...
set.seed(110)
n <- 500
test <- data.frame(smoke_status = rbinom(n, 2, 0.6),
                   smoke_amount = rbinom(n, 2, 0.5),
                   rf1 = rnorm(n), rf2 = rnorm(n),
                   outcome = rbinom(n, 1, 0.3))
# smoke_status (0, 1, 2) is c("non-smoker", "ex-smoker", "current_smoker"), and
# smoke_amount (0, 1, 2) is c("light", "moderate", "heavy")
# rf1 and rf2 are two other risk factors (for illustration purposes;
# the real data set has more risk factors)
# artificially set some of these values to NA
test$smoke_status[sample(1:nrow(test), 60)] <- NA
test$smoke_amount[sample(1:nrow(test), 60)] <- NA
test$rf1[sample(1:nrow(test), 50)] <- NA
test$rf2[sample(1:nrow(test), 50)] <- NA
I'm trying to impute all missing values, but I only want to impute
smoke_amount if smoke_status == 2 (i.e. they are a current smoker; it
makes no sense to impute smoke_amount if they do not smoke).  I can do
this in Stata via the conditional option in ICE, but would prefer to
keep this in R.  Is this feasible via MICE, mi or Amelia, and if so,
how?  I thought the passive imputation approach in MICE would be the
way forward, but I've so far been unsuccessful.
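To make the condition concrete, here is a minimal base-R sketch of the cells I would want imputed. It only builds a logical mask and does not call any imputation package. The names impute_mask and is_current are made up for illustration, and whether MICE, mi or Amelia can be told to respect such a mask is exactly the part I am unsure about. It also only uses the observed smoke_status, so rows whose smoke_status is itself missing (and might later be imputed as 2) are not handled.

```r
# Rebuild the example data so this sketch runs on its own
set.seed(110)
n <- 500
test <- data.frame(smoke_status = rbinom(n, 2, 0.6),
                   smoke_amount = rbinom(n, 2, 0.5),
                   rf1 = rnorm(n), rf2 = rnorm(n),
                   outcome = rbinom(n, 1, 0.3))
test$smoke_status[sample(1:nrow(test), 60)] <- NA
test$smoke_amount[sample(1:nrow(test), 60)] <- NA
test$rf1[sample(1:nrow(test), 50)] <- NA
test$rf2[sample(1:nrow(test), 50)] <- NA

# Start from "impute every missing cell" (logical matrix, same shape as test)
impute_mask <- is.na(test)

# Restrict smoke_amount to rows observed to be current smokers.
# %in% returns FALSE for NA, so rows with unknown smoke_status are excluded.
is_current <- test$smoke_status %in% 2
impute_mask[, "smoke_amount"] <- is.na(test$smoke_amount) & is_current

# Number of cells that would be imputed in each column
colSums(impute_mask)
```

Comparing colSums(impute_mask) with colSums(is.na(test)) shows how many missing smoke_amount values would deliberately be left unimputed under this rule.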
Thanks in advance.
Gary
    
    