[R] mice: selecting small subset of variables to impute from dataset with many variables (> 2500)

Ian McPhail
Thu Jul 14 19:59:15 CEST 2022


I am looking for some advice on how to select subsets of variables for
imputing when using the mice package.

From Van Buuren's original mice paper, I see that a variable can be
'skipped' in an imputation like this:

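library(mice)

# Dry run (maxit = 0) to obtain the default settings without imputing
ini <- mice(nhanes2, maxit = 0, printFlag = FALSE)
pred <- ini$predictorMatrix
pred[, "bmi"] <- 0   # do not use bmi as a predictor for other variables
meth <- ini$method
meth["bmi"] <- ""    # empty method: bmi itself is not imputed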

The last two lines specify that the "bmi" variable is skipped and not
imputed.
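
For completeness, my understanding is that the modified objects are then
passed back to mice() as below. This is only a sketch on the nhanes2 example
data that ships with mice; the seed is an arbitrary value I picked for
reproducibility.

```r
library(mice)

# Dry run to obtain the default method vector and predictor matrix
ini  <- mice(nhanes2, maxit = 0, printFlag = FALSE)
pred <- ini$predictorMatrix
meth <- ini$method

pred[, "bmi"] <- 0   # bmi is not used to predict the other variables
meth["bmi"]   <- ""  # bmi itself is not imputed

# Run the imputation with the modified settings (seed is arbitrary)
imp <- mice(nhanes2, method = meth, predictorMatrix = pred,
            printFlag = FALSE, seed = 123)

# bmi should still contain its missing values; the other columns are filled in
completed <- complete(imp)
colSums(is.na(completed))
```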

And I have come across other examples, but all that I have seen lay out a
method of skipping variables where EVERY variable is named (as "bmi" is
named above). I am wondering if there is a reasonably easy way to select
out approximately 30 variables for imputation from a larger dataset with
around 2500 variables, without having to name all 2450+ other variables.
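
To make the question concrete, here is the kind of thing I have in mind: name
only the variables to impute and derive the rest with setdiff(). This is a
minimal sketch on made-up data, not something I have tried on the real
dataset. The data frame dat, the vector to_impute, and the column names
V1 to V6 are all hypothetical stand-ins. Setting every skipped column of the
predictor matrix to zero is my own assumption, carried over from the bmi
example; complete variables could presumably be kept as predictors instead.

```r
library(mice)

# Toy data standing in for the real dataset (hypothetical): 6 columns, 20 rows
set.seed(1)
m <- matrix(rnorm(120), nrow = 20)
m[c(1, 4), 1] <- NA
m[c(2, 5), 2] <- NA
m[c(3, 6), 3] <- NA
m[7, 5]       <- NA            # missing value in a column we do NOT want imputed
dat <- as.data.frame(m)        # columns are named V1..V6

# Name only the variables to impute (about 30 in the real case)
to_impute <- c("V1", "V2", "V3")
skip      <- setdiff(names(dat), to_impute)   # everything else, no typing needed

# Dry run to obtain the defaults, then switch off all skipped variables at once
ini  <- mice(dat, maxit = 0, printFlag = FALSE)
meth <- ini$method
pred <- ini$predictorMatrix
meth[skip]   <- ""   # skipped variables are not imputed
pred[, skip] <- 0    # assumption: skipped variables are not used as predictors

imp <- mice(dat, method = meth, predictorMatrix = pred,
            m = 2, maxit = 2, printFlag = FALSE, seed = 1)
```

With 2500 columns, the same setdiff() line would replace naming the 2450+
variables by hand, if this is indeed a sensible way to go about it.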

Thank you,

