[R] Generate multivariate normal data with a random correlation matrix
Eik Vettorazzi
E.Vettorazzi at uke.uni-hamburg.de
Wed Feb 9 17:53:25 CET 2011
Hi Rick,
you can generate an arbitrary matrix X and calculate X'X, which is at
least positive semi-definite (and definite, when X has full rank)
X<-matrix(rnorm(16),4)
covmat<-t(X)%*%X
#check
!any(eigen(covmat)$value<0)
#corresponding correlation matrix
cov2cor(covmat)
Random multivariate can be generated with 'mvrnorm' from the MASS-package.
hth.
Am 09.02.2011 17:19, schrieb Rick DeShon:
> Hi All.
>
> I'd like to generate a sample of n observations from a k dimensional
> multivariate normal distribution with a random correlation matrix.
>
> My solution:
> 1) The lower (or upper) triangle of the correlation matrix has
> n.tri=(d/2)(d+1)-d entries.
> 2) Take a uniform sample of n.tri possible correlations (runi(n.tr,-.99,.99)
> 3) Populate a triangle of the matrix with the sampled correlations
> 4) Mirror the triangle to populate the other triangle forming a
> symmetric matrix, cormat
> 5) Sample n observations from a multivariate normal distribution with
> mean vector=0 and varcov=cormat
>
>
> Problem:
> This approach violates the triangle inequality property of correlation
> matrices. So, the matrix I've constructed is certainly a valid matrix
> but it is not a valid correlation matrix and it blows up when you
> submit it to a random number generator such as rmnorm. With a small
> matrix you sometimes get lucky and generate a valid correlation matrix
> but as you increase d the probability of obtaining a valid correlation
> matrix drops off quickly.
>
> So, any ideas on how to construct a correlation matrix with random
> entries that cover the range (or most of the range) or the correlation
> [-1,1]?
>
> Here's the code I've used that won't work.
> ************************************************
> library(mnormt)
> n <- 1000
> d <- 50
>
> n.tri <- ((d*(d+1))/2)-d
> r <- runif(n.tri, min=-.5, max=.5)
>
> cormat <- diag(c)
> count1=1
> for (i in 1:c){
> for (j in 1:c){
> if (i<j) {
> cormat[i,j]=r[count1]
> cormat[j,i]=cormat[i,j]
> count1=count1+1
> }
> }
> }
> eigen(cormat) # if negative eigenvalue, then the matrix violates
> the triangle inequality
>
> x <- rmnorm(n, rep(0, c), cormat) # Sample the data
>
>
>
> Thanks in advance,
>
> Rick DeShon
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
More information about the R-help
mailing list