[R] Random data with correlation

Edwin Velds E.Velds at balpol.tudelft.nl
Fri Mar 8 01:25:50 CET 2002


Hello all.

First of all, I have only been using are a short time and I'm not an
expert in statistics either.

I have the following problem. I'm working with measurements of physical
samples, each measurement has about 4000 variables. I have 33 of those
samples. From those 400 variables I deduced through non-statiscal means
that I needed about 200 of them. I read those into a data.frame with 200
columns of 33 rows. As I expected, quite a few variables have a fairly
high correlation with one another (as high as 0.96). The individual
variables are about normally distributed according to the Shapiro-Wilks
test.

With this data, I need to generate new random samples. However, for them
to be any good they need to have a similar correlation to the original
samples. Searching the net I found that this could be accomplished by
calculating random samples with

 Random-sample = M + R L

where M = vector of means
      R = a vector of snorm values
      L = the upper triangle matrix of the Choleski decomposition of the
covariance matrix.

I verified this with a simple hand made example and it works. But trying
this with the real data got me into trouble. chol() tells me that the
covariance matrix of my test data is not positive definite. Which I
found odd because I was told that a real covariance matrix should be
that.
After some testing I found that qr(cov-matrix)$rank=32 and
qr(cov-matrix, tol=1e-20)$rank=200. Which probably means that the matrix
is numerically badly conditioned.

But now I'm at a loss on how to proceed. Changing the tolerance gave me
an answer there. But in for instance the det() function (using qr), I
can't set the tolerance to pass on to qr and therefore can't get an
answer. Could the reason for chol() failing be related to this too? Or
maybe there is another method of getting what I want by using less
sensitive functions and avoiding the problem altogether?

Any help in this would be greatly appreciated.

Thanks,
Edwin
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list