[R] multiple imputation based on a condition
Gary Collins
collins.gs at gmail.com
Sat May 22 23:55:07 CEST 2010
I would be grateful for any suggestions on the following.
I'm trying to impute data, where a fictitious dataset is defined as follows:
set.seed(110)
n <- 500
test <- data.frame(
  smoke_status = rbinom(n, 2, 0.6),
  smoke_amount = rbinom(n, 2, 0.5),
  rf1 = rnorm(n),
  rf2 = rnorm(n),
  outcome = rbinom(n, 1, 0.3)
)
# smoke_status (0, 1, 2) is c("non-smoker", "ex-smoker", "current_smoker"), and
# smoke_amount (0, 1, 2) is c("light", "moderate", "heavy")
# rf1 and rf2 are two other risk factors (for illustration purposes;
# the real data set has more risk factors)
# artificially set some of these values to NA
test$smoke_status[sample(1:nrow(test), 60)] <- NA
test$smoke_amount[sample(1:nrow(test), 60)] <- NA
test$rf1[sample(1:nrow(test), 50)] <- NA
test$rf2[sample(1:nrow(test), 50)] <- NA
I'm trying to impute all missing values, but I only want to impute
smoke_amount if smoke_status == 2 (i.e. the person is a current smoker;
it makes no sense to impute smoke_amount for someone who does not smoke).
I can do this in Stata via the conditional option in ICE, but would
prefer to keep this in R. Is this feasible via MICE, mi or Amelia? I
thought the passive imputation approach in MICE would be the way forward,
but I've so far been unsuccessful.
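To make the condition concrete, here is a minimal base-R sketch of which cells I would want filled in. It only flags the cells and is not an imputation method. The name impute_amount is made up for illustration. I am assuming that rows where smoke_status is itself missing are excluded at this stage, although they might qualify once smoke_status has been imputed as 2.

```r
# Recreate the fictitious data so this sketch runs on its own
set.seed(110)
n <- 500
test <- data.frame(
  smoke_status = rbinom(n, 2, 0.6),
  smoke_amount = rbinom(n, 2, 0.5),
  rf1 = rnorm(n),
  rf2 = rnorm(n),
  outcome = rbinom(n, 1, 0.3)
)
test$smoke_status[sample(1:nrow(test), 60)] <- NA
test$smoke_amount[sample(1:nrow(test), 60)] <- NA
test$rf1[sample(1:nrow(test), 50)] <- NA
test$rf2[sample(1:nrow(test), 50)] <- NA

# impute_amount (hypothetical name): TRUE where smoke_amount is missing
# AND smoke_status is observed as 2 (current smoker).
# %in% returns FALSE for a missing smoke_status, so those rows are
# excluded here (an assumption, see the note above).
impute_amount <- is.na(test$smoke_amount) & test$smoke_status %in% 2

# Missingness of smoke_amount broken down by smoke_status
print(table(status = test$smoke_status,
            amount_missing = is.na(test$smoke_amount),
            useNA = "ifany"))

# Number of cells that should be imputed under the condition
print(sum(impute_amount))
```

Any missing smoke_amount outside this flag (non-smokers and ex-smokers) should stay as it is rather than be imputed.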
Thanks in advance.
Gary