[R] Generate multivariate normal data with a random correlation matrix

Rick DeShon deshon at msu.edu
Wed Feb 9 17:19:39 CET 2011


Hi All.

I'd like to generate a sample of n observations from a d-dimensional
multivariate normal distribution with a random correlation matrix.

My solution:
1) The lower (or upper) triangle of the correlation matrix has
n.tri=(d/2)(d+1)-d entries.
2) Take a uniform sample of n.tri possible correlations (runif(n.tri, -.99, .99))
3) Populate a triangle of the matrix with the sampled correlations
4) Mirror the triangle to populate the other triangle forming a
symmetric matrix, cormat
5) Sample n observations from a multivariate normal distribution with
mean vector=0 and varcov=cormat


Problem:
This approach violates the triangle inequality property of correlation
matrices (that is, the result need not be positive semidefinite).  So,
the matrix I've constructed is certainly a valid symmetric matrix but it
is usually not a valid correlation matrix, and it blows up when you
submit it to a random number generator such as rmnorm.  With a small
matrix you sometimes get lucky and generate a valid correlation matrix,
but as you increase d the probability of obtaining a valid correlation
matrix drops off quickly.
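
To illustrate the drop-off, here is a rough simulation sketch. The helper
names naive_cormat and is_valid, the seed, the dimensions tried, the
tolerance and the 500 replications are all illustrative assumptions, not
part of my actual code below. It builds the matrix as in steps 1-4 and
counts how often the smallest eigenvalue is non-negative.

```r
# Sketch: estimate how often the naive construction yields a valid
# (positive semidefinite) correlation matrix as d grows.
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical helper: unit diagonal, uniform draws mirrored across it.
# Note d*(d-1)/2 equals the n.tri = (d/2)(d+1)-d count from step 1.
naive_cormat <- function(d, lim = 0.99) {
  m <- diag(d)
  m[upper.tri(m)] <- runif(d * (d - 1) / 2, min = -lim, max = lim)
  m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror the upper triangle
  m
}

# Hypothetical helper: TRUE if no eigenvalue is (numerically) negative.
is_valid <- function(m, tol = 1e-8) {
  min(eigen(m, symmetric = TRUE, only.values = TRUE)$values) >= -tol
}

dims <- c(2, 3, 4, 5, 8)
prop_valid <- sapply(dims, function(d) {
  mean(replicate(500, is_valid(naive_cormat(d))))
})
names(prop_valid) <- dims
print(prop_valid)  # proportion of valid matrices for each d
```

For d = 2 every draw is valid, since the eigenvalues are 1 + r and 1 - r;
the proportion falls as d increases.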

So, any ideas on how to construct a correlation matrix with random
entries that cover the range (or most of the range) of the correlation,
[-1,1]?

Here's the code I've used that won't work.
************************************************
library(mnormt)
n <- 1000
d <- 50

n.tri <- ((d*(d+1))/2)-d
r       <- runif(n.tri, min=-.5, max=.5)

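cormat <- diag(d)      # d x d identity matrix: ones on the diagonal
count1=1
for (i in 1:d){
       for (j in 1:d){
               if (i<j) {
                               cormat[i,j]=r[count1]       # fill the upper triangle
                               cormat[j,i]=cormat[i,j]     # mirror to the lower triangle
                               count1=count1+1
                            }
       }
}
# If any eigenvalue is negative, the matrix violates the triangle
# inequality (it is not positive semidefinite)
eigen(cormat)

x <-  rmnorm(n, rep(0, d), cormat)  # Sample the data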



Thanks in advance,

Rick DeShon


