[R] Generate a serie of new vars that correlate with existing var
Nguyen Dinh Nguyen
n.nguyen at garvan.org.au
Wed Apr 4 00:51:48 CEST 2007
Dear Greg,
Thanks a million!
"As good as it gets" :)
All the best
Nguyen
-----Original Message-----
From: Greg Snow [mailto:Greg.Snow at intermountainmail.org]
Sent: Wednesday, April 04, 2007 1:46 AM
To: Nguyen Dinh Nguyen; r-help at stat.math.ethz.ch
Subject: RE: [R] Generate a serie of new vars that correlate with existing
var
Here is one way to do it:
# create the initial x variable
x1 <- rnorm(100, 15, 5)
# x2, x3, and x4 in a matrix; these will be modified to meet the criteria
x234 <- scale(matrix( rnorm(300), ncol=3 ))
# put all into 1 matrix for simplicity
x1234 <- cbind(scale(x1),x234)
# find the current correlation matrix
c1 <- var(x1234)
# Cholesky decomposition to make the columns uncorrelated
chol1 <- solve(chol(c1))
newx <- x1234 %*% chol1
# check that the columns are uncorrelated and the first column (scaled x1) is unchanged
zapsmall(cor(newx))
all.equal( x1234[,1], newx[,1] )
# create new correlation structure (zeros can be replaced with other r values)
newc <- matrix(
  c(1  , 0.4, 0.5, 0.6,
    0.4, 1  , 0  , 0  ,
    0.5, 0  , 1  , 0  ,
    0.6, 0  , 0  , 1  ), ncol=4 )
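# check that it is positive definite (all eigenvalues must be positive)
eigen(newc)
# impose the new correlation structure, then restore the mean and SD of x1
chol2 <- chol(newc)
finalx <- newx %*% chol2 * sd(x1) + mean(x1)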
# verify success
mean(x1)
colMeans(finalx)
sd(x1)
apply(finalx, 2, sd)
zapsmall(cor(finalx))
pairs(finalx)
all.equal(x1, finalx[,1])
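
The steps above can also be wrapped into a single function. What follows is only a sketch under stated assumptions: the name make_correlated and its arguments x and r are made up for illustration and are not part of any package, and the new variables are assumed to be uncorrelated with each other, as in the newc matrix above.

```r
# Hypothetical helper (name and arguments are illustrative assumptions).
#   x : the existing numeric variable
#   r : target correlations between x and each new variable
make_correlated <- function(x, r) {
  n <- length(x)
  k <- length(r)
  # target correlation matrix: r in the first row/column,
  # new variables assumed mutually uncorrelated
  target <- diag(k + 1)
  target[1, -1] <- r
  target[-1, 1] <- r
  # the target must be positive definite for chol() to succeed
  ev <- eigen(target, symmetric = TRUE, only.values = TRUE)$values
  if (any(ev <= 0)) stop("target correlation matrix is not positive definite")
  # standardize x together with k columns of random noise
  z <- cbind(scale(x), scale(matrix(rnorm(n * k), ncol = k)))
  # remove the sample correlations; the first column is left unchanged
  z <- z %*% solve(chol(var(z)))
  # impose the target correlations, then restore the mean and SD of x
  out <- z %*% chol(target) * sd(x) + mean(x)
  colnames(out) <- paste0("x", seq_len(k + 1))
  out
}

set.seed(1)  # only so the example is reproducible
x1 <- rnorm(100, 15, 5)
finalx <- make_correlated(x1, c(0.4, 0.5, 0.6))
zapsmall(cor(finalx))
all.equal(x1, unname(finalx[, 1]))
```

Because the sample correlations are removed before the target structure is applied, the correlations in the result match the targets in the sample itself, not just on average. The sum of the squared values in r has to stay below 1, otherwise the target matrix is not positive definite and the function stops.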
Hope this helps,
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Nguyen
> Dinh Nguyen
> Sent: Sunday, April 01, 2007 7:47 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Generate a serie of new vars that correlate with
> existing var
>
> Dear R helpers,
> I have a variable (let's call it X1) with an approximately Normal
> distribution (say, mean=15, SD=5).
> I want to generate a series of additional variables X2, X3,
> X4... such that the correlation between X2 and X1 is 0.4, X3 and
> X1 is 0.5, X4 and X1 is 0.6, and so on, with the condition that all
> variables X2, X3, X4... have the same mean and SD as X1.
> Any help would be appreciated.
> Regards
> Nguyen
>
>