[R] Generate multivariate normal data with a random correlation matrix
Rick DeShon
deshon at msu.edu
Wed Feb 9 17:06:08 CET 2011
Hi All.
I'd like to generate a sample of n observations from a d-dimensional
multivariate normal distribution with a random correlation matrix.
My solution:
The lower (or upper) triangle of the correlation matrix has
n.tri = d(d+1)/2 - d = d(d-1)/2 entries.
Take a uniform sample of n.tri possible correlations: runif(n.tri, -.99, .99)
Populate a triangle of the matrix with the sampled correlations
Mirror the triangle to populate the other triangle forming a symmetric
matrix, cormat
Sample n observations from a multivariate normal distribution with
mean vector=0 and varcov=cormat
Problem:
This approach does not guarantee that the matrix is positive
semidefinite, which every correlation matrix must be (the correlations
constrain one another, much like a triangle inequality). So, the matrix
I've constructed is certainly a valid symmetric matrix, but it is
generally not a valid correlation matrix, and it blows up when you
submit it to a random number generator such as rmnorm. With a small
matrix you sometimes get lucky and generate a valid correlation matrix,
but as you increase d the probability of obtaining a valid correlation
matrix drops off quickly.
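To make the drop-off concrete, here is a rough simulation sketch in base R. The helper names random_symmetric and is_valid_cor are made up for illustration, and the dimensions and replicate count are arbitrary choices. It estimates how often the fill-the-triangle approach yields a positive definite matrix.

```r
# Hypothetical helper: symmetric matrix with unit diagonal and
# uniform(-lim, lim) off-diagonal entries, as in the approach above.
random_symmetric <- function(d, lim = 0.99) {
  m <- diag(d)
  m[upper.tri(m)] <- runif(d * (d - 1) / 2, min = -lim, max = lim)
  m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror upper triangle into lower
  m
}

# Hypothetical helper: TRUE if all eigenvalues are strictly positive.
is_valid_cor <- function(m) {
  min(eigen(m, symmetric = TRUE, only.values = TRUE)$values) > 0
}

set.seed(1)
dims <- c(2, 3, 5, 10)
# Proportion of 500 random draws per dimension that are positive definite.
prop_valid <- sapply(dims, function(d) {
  mean(replicate(500, is_valid_cor(random_symmetric(d))))
})
names(prop_valid) <- dims
print(prop_valid)
```

For d = 2 every draw is valid, since the eigenvalues are 1 + r and 1 - r; the proportion falls as d grows.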
So, any ideas on how to construct a correlation matrix with random
entries that cover the range (or most of the range) of the correlation,
[-1, 1]?
Here's the code I've used that won't work.
************************************************
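library(mnormt)
n <- 1000                             # number of observations
d <- 50                               # number of dimensions
n.tri <- ((d*(d+1))/2)-d              # entries in one triangle
r <- runif(n.tri, min=-.5, max=.5)    # candidate correlations
cormat <- diag(d)                     # start from the identity matrix
count1 <- 1
for (i in 1:d){
  for (j in 1:d){
    if (i<j) {
      cormat[i,j] <- r[count1]        # fill the upper triangle
      cormat[j,i] <- cormat[i,j]      # mirror into the lower triangle
      count1 <- count1+1
    }
  }
}
# If any eigenvalue is negative, the matrix is not positive semidefinite
# and so is not a valid correlation matrix.
eigen(cormat)
x <- rmnorm(n, rep(0, d), cormat)     # Sample the data (fails if cormat is invalid)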
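For comparison, here is a minimal sketch of one construction that is positive definite by design. It is not the approach above. It rescales a cross-product matrix A A' with cov2cor, using a d x k Gaussian matrix A (k is an arbitrary choice here). The catch is that the off-diagonal entries are not uniform: they cluster near zero as k grows, so it does not cover [-1, 1] evenly. Sampling uses a Cholesky factor in base R rather than rmnorm, only to keep the sketch self-contained.

```r
set.seed(42)
n <- 1000
d <- 50
k <- 2 * d                                # assumed choice; k >= d gives full rank almost surely

a <- matrix(rnorm(d * k), nrow = d)       # d x k Gaussian matrix
cormat_ok <- cov2cor(tcrossprod(a))       # a %*% t(a), rescaled to unit diagonal

# Smallest eigenvalue should be strictly positive.
min_eig <- min(eigen(cormat_ok, symmetric = TRUE, only.values = TRUE)$values)

# Sample N(0, cormat_ok): if z has identity covariance and
# t(u) %*% u == cormat_ok, then z %*% u has covariance cormat_ok.
u <- chol(cormat_ok)
x_ok <- matrix(rnorm(n * d), nrow = n) %*% u

range(cormat_ok[upper.tri(cormat_ok)])    # shows how narrow the spread is
```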
Thanks in advance,
Rick DeShon