[R] Covariance of data with missing values.
(Ted Harding)
Ted.Harding at manchester.ac.uk
Thu Aug 16 01:10:36 CEST 2007
On 15-Aug-07 21:16:32, Rolf Turner wrote:
>
> I have a data matrix X (n x k, say) each row of which constitutes
> an observation of a k-dimensional random variable which I am willing,
> if not happy, to assume to be Gaussian, with mean ``mu'' and
> covariance matrix ``Sigma''. Distinct rows of X may be assumed to
> correspond to independent realizations of this random variable.
>
> Most rows of X (all but 240 out of 6000+ rows) contain one or more
> missing values.
> [...]
One question, Rolf: How big is k (the number of columns)?
If it's greater than 30, you may have problems with 'norm', since the
function prelim.norm() builds up its image of the places where there
are missing values as "packed integers" with code:
  r <- 1 * is.na(x)
  ....
  mdp <- as.integer((r %*% (2^((1:ncol(x)) - 1))) + 1)
i.e. 'r' would be n x k, with 1s where your X has missing values and 0s
elsewhere. Each row of 'r' is then converted into a 32-bit integer whose
"1" bits correspond to the 1s in that row. You'll get "NA" warnings if
k>30, and things could go wrong!
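To make the packing concrete, here is a minimal sketch that wraps the two quoted lines in a helper and runs it on a narrow and a wide matrix. The helper name pack_pattern() and the toy matrices are made up for illustration and are not part of 'norm'. The wide case uses k = 33 with a missing value in the last column, so the packed value is 2^32 + 1, which is larger than .Machine$integer.max.

```r
# Hypothetical helper (not in 'norm'): wraps the two lines quoted from
# prelim.norm() so the packing can be tried on its own.
pack_pattern <- function(x) {
  r <- 1 * is.na(x)                               # n x k indicator, 1 = missing
  as.integer((r %*% (2^((1:ncol(x)) - 1))) + 1)   # one packed code per row
}

# Narrow case, k = 3: every pattern code fits in an integer.
x_small <- matrix(c( 1, NA, 3,
                    NA, NA, 6,
                     7,  8, 9), nrow = 3, byrow = TRUE)
codes_small <- pack_pattern(x_small)
print(codes_small)

# Wide case, k = 33: a missing value in column 33 contributes 2^32,
# which cannot be stored as an R integer, so as.integer() returns NA
# (with a warning, suppressed here to keep the output short).
x_wide <- matrix(0, nrow = 2, ncol = 33)
x_wide[1, 33] <- NA
codes_wide <- suppressWarnings(pack_pattern(x_wide))
print(codes_wide)
print(.Machine$integer.max)
```

In the narrow case each row gets a distinct code for its missingness pattern. In the wide case the first row's code comes back as NA, so the pattern information for that row is lost.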
In that case, I hope Chuck's suggestion works!
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Aug-07 Time: 00:10:33
------------------------------ XFMail ------------------------------