[R] Too many columns with prelim.norm

(Ted Harding) Ted.Harding at manchester.ac.uk
Thu Jun 17 11:29:23 CEST 2010


On 17-Jun-10 05:42:42, rockclimber112358 at gmail.com wrote:
> -----Original Message-----
> From: rockclimber112358 at gmail.com
> Date: Wed, 16 Jun 2010 23:38:49 
> To: <r-help at r-project.org>
> Reply-To: rockclimber112358 at gmail.com
> Subject: Too many columns with prelim.norm
> 
> Hi everyone,
> 
> I'm trying to use prelim.norm with a "big" matrix (36 columns by 10000
> or so rows).  I found that prelim.norm has a built-in limit of 30
> columns, but I'd still really like to use it for my data.  Does anyone
> know of a different way to do the same thing?  Or, would it be easier
> to try to modify the source code?  Thanks!
> 
> Josh

Modifying the source code would not be at all easy! The reason for the
limitation is that prelim.norm (and subsequently norm itself) stores
the pattern of missingness in each row of the data in a 32-bit integer,
as a pattern of 0s and 1s (1 where a value is missing, 0 where it is
observed), i.e. (in effect) as sum(2^(miss-1)) where "miss" is the set
of columns (1,2,...) that have missing values. These integers are then
collected into a vector, with one element for each row of the data.
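
To make the encoding concrete, here is a minimal sketch in R. It is not
the code inside prelim.norm; encode_missingness() and max_code() are
hypothetical helpers written only to illustrate the sum(2^(miss-1)) idea.
A signed 32-bit integer holds values up to 2^31 - 1, so a scheme of this
kind runs out of room at around 30 columns; the exact cutoff depends on
details of the real implementation that are not reproduced here.

```r
# Hypothetical helper (NOT part of the norm package): encode which columns
# of each row are missing as a single number, sum(2^(miss - 1)).
encode_missingness <- function(x) {
  r <- 1 * is.na(x)                    # 1 where missing, 0 where observed
  weights <- 2^(seq_len(ncol(x)) - 1)  # column j carries weight 2^(j - 1)
  drop(r %*% weights)                  # one code per row, held as a double
}

x <- matrix(c( 1, NA, 3,
              NA, NA, 6,
               7,  8, 9), nrow = 3, byrow = TRUE)
codes <- encode_missingness(x)
# row 1: column 2 missing         -> 2^1       = 2
# row 2: columns 1 and 2 missing  -> 2^0 + 2^1 = 3
# row 3: nothing missing          -> 0

# Hypothetical helper: the largest code needed for p columns
# (every value in the row missing) is 2^p - 1.
max_code <- function(p) 2^p - 1

fits_30 <- max_code(30) <= .Machine$integer.max   # 30 columns fit
fits_36 <- max_code(36) <= .Machine$integer.max   # 36 columns do not

# Forcing the 36-column code into R's 32-bit integer type gives NA
# (with a warning, suppressed here).
as_int_36 <- suppressWarnings(as.integer(max_code(36)))
```

With 36 columns, as in the question, the largest pattern code cannot be
held in a 32-bit integer at all, which is why a wider data set would need
a different storage scheme throughout the code rather than a small patch.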

This vector is then processed in FORTRAN, and it crops up all over the
place in the FORTRAN code underlying prelim.norm, em.norm, etc.

So you're not going to be able to work round it without a massive
re-writing of the source. There is no "trick" for applying these
functions to more than 30 columns (not even if you knew that only a
certain subset of the columns can have missing values).

The same limitation applies to the number of columns for 'cat', and,
for 'mix', separately to the number of columns of categorical data and
to the number of columns of continuous data.

If you want to do MI on more than 30 columns of continuous data,
then you should consider one of the other R packages for MI.
I leave it to others more familiar with these to make suggestions!

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 17-Jun-10                                       Time: 10:29:20
------------------------------ XFMail ------------------------------


