[R] md.pattern ('mice') failure with more than 31 variables

Joshua Wiley jwiley.psych at gmail.com
Tue Nov 29 11:35:12 CET 2011


On Tue, Nov 29, 2011 at 1:58 AM,  <saschaview at gmail.com> wrote:
> Hello
>
> How come that the function md.pattern() from package 'mice' delivers a
> warning when run over data sets with more than 31 variables?

Because 2^31 is too large a value to be represented as an integer
(R integers are 32-bit, so the largest one is 2^31 - 1).
The 15th line of md.pattern has the code:

 mdp <- as.integer((r %*% (2^((1:ncol(x)) - 1))) + 1)

When ncol(x) > 31, the largest exponent, ncol(x) - 1, is at least 31, and

as.integer(2^31)

returns NA and gives the warning you see.  Technically, the warning
does not occur at the 2^... part (that is computed in double
precision); it occurs when the result is converted to integer.  So if
there were no missing values, r (a 0/1 matrix indicating whether a
particular cell is missing) would be all zeros, r %*% (weights of
2^31 or more) would still be 0, and you would not get any warning.
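
To make the overflow concrete, here is a small sketch in plain R (no
'mice' needed).  The all-zero matrix r below is a made-up stand-in for
the indicator matrix inside md.pattern, and suppressWarnings() is there
only to keep the demo quiet; neither is taken from the package itself.

```r
# Largest value an R integer can hold: 2^31 - 1
int_max <- .Machine$integer.max

# 2^31 is a perfectly good double ...
big <- 2^31
is.double(big)

# ... but converting it to integer overflows and yields NA
# (normally with a warning, silenced here for the demo)
overflowed <- suppressWarnings(as.integer(big))
is.na(overflowed)

# One below the limit converts without trouble
ok <- as.integer(2^31 - 1)
identical(ok, int_max)

# Hypothetical indicator matrix with no missing cells: every product is 0,
# so the large weights never reach as.integer() and nothing is coerced to NA
r <- matrix(0, nrow = 3, ncol = 35)
codes <- as.integer((r %*% (2^((1:ncol(r)) - 1))) + 1)
codes
```

So the size of the weights alone is harmless; the NA only appears once a
row actually has a missing value in column 32 or beyond.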

Aside from some storage inefficiency for < 31 columns, I do not see
any harm from simply removing the conversion to integer.  For < 31
columns, the function appears to give equal results with or without
the conversion, but for > 31 columns, some patterns are not included
when as.integer is used.

Cheers,

Josh

>
> library( 'mice' )
> x <- as.data.frame(
>  matrix(
>    sample( c(1:3, 1:3, 1:3, NA), 7000, replace=TRUE ),
>    ncol=35,
>    dimnames=list(NULL,
>      paste('V', 11:45, sep="")
>    )
>  )
> )
>
> md.pattern(x) # Warning message: In md.pattern(x) : NAs introduced by coercion
> md.pattern(x[, 1:31]) # fine
>
> Thanks, *S*
>
> --
> Sascha Vieweg, saschaview at gmail.com
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


