[R-sig-ME] large dataset - lmer2 vector size specified is too large
Douglas Bates
bates at stat.wisc.edu
Fri Mar 2 15:36:05 CET 2007
On 3/1/07, florian bw <florian.bw at gmail.com> wrote:
> Hi,
>
> I want to model mRNA expression data as a function of sex.
>
> I have the following values:
> expr: expression value (for gene/person)
> affyID: gene ID
> cephID: person ID
> sex
>
> With 224 genes and 195 persons, that gives 43,680 data points. I get
> errors with both the nlme and the lme4 packages. I tried R 2.4 and
> 2.5 with the newest package versions.
>
> I have a 64-bit machine with 8 GB of RAM. Is the dataset simply too
> large? I already cut it down and would actually be glad if I could do
> the calculation with ~ 8000x250 data points.
>
> Thank you for your help.
>
> Florian Breitwieser
> UNSW Sydney
> Systems Biology
>
> ---------------------------------------------------
>
> > sessionInfo()
> R version 2.5.0 Under development (unstable) (2007-02-26 r40806)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods"
> [7] "base"
>
> other attached packages:
>        lme4      Matrix     lattice        nlme
> "0.9975-13" "0.9975-11"   "0.14-16"    "3.1-79"
>
>
> -----------------------------------------------------
>
> > library(lme4)
> > sex.lme <- lmer2(expr ~ affyID*sex + affyID|cephID,data=ds.n)
> Error in vector("double", length) : vector size specified is too large
>
Do you know that, because | has lower precedence than + and *, this
formula is equivalent to
    expr ~ 1 + (affyID*sex|cephID)
I think you meant
    expr ~ affyID * sex + (affyID|cephID)
but even that formula means that you are estimating 448 fixed effects,
for which the model matrix is of size 448 * 43680 * 8 bytes (about 150
MB). In addition you are attempting to estimate
    (224 * (224 + 1))/2 = 25200
variance-covariance components from 43680 observations.
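
The figures above can be checked with a few lines of base R. This is only a rough sketch: the variable names are made up for illustration, and it assumes that sex has two levels and that the model matrix is stored as a dense matrix of doubles.

```r
# Counts taken from the original post
n_genes   <- 224
n_persons <- 195
n_obs     <- n_genes * n_persons          # 43680 observations

# Fixed effects in expr ~ affyID * sex, assuming sex has two levels:
# 224 gene terms (intercept plus 223 contrasts) and 224 terms for
# sex and the gene-by-sex interaction
n_fixed <- 2 * n_genes                    # 448

# Dense model matrix of doubles: rows x columns x 8 bytes
x_bytes <- n_obs * n_fixed * 8
x_mib   <- x_bytes / 2^20                 # size in MiB, about 150

# Free parameters in an unstructured 224 x 224 covariance matrix
n_varcov <- n_genes * (n_genes + 1) / 2   # 25200

# Observations available per variance-covariance parameter (under 2)
obs_per_param <- n_obs / n_varcov
```

Note also that the 224 x 224 covariance matrix would have to be estimated from only 195 levels of cephID.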
I suggest that you reconsider the model specification. The readers of
this list will be able to help with the interpretation of the model
specification if you want to discuss it.
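
To see how R reads the two formulas, one can inspect the parsed formula objects directly. The sketch below uses only base R and does not need the data or lme4; the object names (f_typed, f_meant and so on) are made up for illustration.

```r
# The formula as typed in the original post (formulas are not evaluated,
# so the variables do not need to exist)
f_typed <- expr ~ affyID*sex + affyID|cephID

# The right-hand side is a single call to `|`: everything before the bar
# is the random-effects expression and cephID is the grouping factor
rhs          <- f_typed[[3]]
top_operator <- as.character(rhs[[1]])   # "|"
grouping     <- as.character(rhs[[3]])   # "cephID"

# With parentheses around the random-effects term, the top-level
# operator on the right-hand side is `+`, so affyID * sex stays a
# fixed-effects term
f_meant            <- expr ~ affyID * sex + (affyID|cephID)
top_operator_meant <- as.character(f_meant[[3]][[1]])   # "+"
```

Printing rhs[[2]] shows the expression that ends up to the left of the bar in the formula as typed.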
>
> ------------------------------------------------------
>
> > library(nlme)
> > sex.lme <- lme(expr ~ affyID*sex,random=~affyID|cephID,data=ds.n)
> Error: cannot allocate vector of size 4.7 Gb
> > gc()
>           used (Mb) gc trigger   (Mb)   max used   (Mb)
> Ncells  829281 44.3    5714627  305.2   10890793  581.7
> Vcells 1881820 14.4 1034019068 7889.0 1103035467 8415.5
>
>
> Another time I got the following message:
> > sex.lme <- lme(expr ~ affyID*sex,random=~affyID|cephID,data=ds.n)
>
> *** caught segfault ***
> address (nil), cause 'unknown'
>
> Traceback:
> 1: lme.formula(expr ~ affyID * sex, random = ~affyID | cephID, data = ds.n)
> 2: lme(expr ~ affyID * sex, random = ~affyID | cephID, data = ds.n)