[R] R help
Bert Gunter
gunter.berton at gene.com
Sun Feb 19 21:55:59 CET 2012
Folks:
Perhaps I am naive, ignorant, or foolish, but this whole discussion
seems rather ridiculous. What possible relation to reality could a
multivariate normal of the size requested have? Even for simulation,
it seems like nonsense.
Cheers,
Bert
On Sun, Feb 19, 2012 at 11:35 AM, Petr Savicky <savicky at cs.cas.cz> wrote:
> On Sat, Feb 18, 2012 at 06:00:53PM -0500, li li wrote:
>> Dear all,
>> I need to generate numbers from multivariate normal with large dimensions
>> (5,000,000).
>
> Hi.
>
> I am replying to your first email, since the other did not arrive
> to my folder, possibly filtered out by a spam filter. I see them
> at the web interface.
>
> 1. Error: cannot allocate vector of size 381.5 Mb
>
> The error message makes sense. The matrix requires m*n*8/2^20 MB,
> which is in your case
>
> m <- 100000
> n <- 500
> m*n*8/2^20
>
> [1] 381.4697
>
> May be, you already have other large objects in the memory. Try to
> minimize the number and size of objects, which you need simultaneously
> in an R session.
>
> 2. Generating a multivariate normal distribution.
>
> As Peter Dalgaard pointed out, a speed up is possible only
> for special types of the covariance matrix Sigma. A general
> way is to find a matrix A such that A A^t = Sigma. Then,
> the vector A X, where X is from N(0,I) and I is an identity
> matrix of an appropriate dimension, has covariance Sigma.
> This is also the way, how mvtnorm package works.
>
> A speed up is possible, if computing the product A X does not
> require to have matrix A explicitly represented in memory.
>
> The matrix A need not be a square matrix. In particular, the
> previous case may be understood as using the matrix A, which
> for a small m is as follows.
>
> m <- 5
> rho <- 0.5
> A <- cbind(sqrt(rho), sqrt(1 - rho)*diag(m))
> A
>
>
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] 0.7071068 0.7071068 0.0000000 0.0000000 0.0000000 0.0000000
> [2,] 0.7071068 0.0000000 0.7071068 0.0000000 0.0000000 0.0000000
> [3,] 0.7071068 0.0000000 0.0000000 0.7071068 0.0000000 0.0000000
> [4,] 0.7071068 0.0000000 0.0000000 0.0000000 0.7071068 0.0000000
> [5,] 0.7071068 0.0000000 0.0000000 0.0000000 0.0000000 0.7071068
>
> A %*% t(A)
>
> [,1] [,2] [,3] [,4] [,5]
> [1,] 1.0 0.5 0.5 0.5 0.5
> [2,] 0.5 1.0 0.5 0.5 0.5
> [3,] 0.5 0.5 1.0 0.5 0.5
> [4,] 0.5 0.5 0.5 1.0 0.5
> [5,] 0.5 0.5 0.5 0.5 1.0
>
> This construction is conceptually possible also for a large m because
> the structure of A allows to compute A X by simpler operations with
> the vector X than an explicit matrix product. Namely, the expression
>
> rnorm(1, sd=sqrt(rho)) + rnorm(m, sd=sqrt(1 - rho))
>
> or, more clearly,
>
> sqrt(rho) * rnorm(1) + sqrt(1 - rho) * rnorm(m)
>
> is equivalent to the required A X, where X consists of rnorm(1)
> and rnorm(m) together.
>
> If you have a specific Sigma, describe it and we can discuss,
> whether an appropriate A can be found.
>
> Hope this helps.
>
> Petr Savicky.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
More information about the R-help
mailing list