[R-sig-ME] Too big for lmer?
Martin Maechler
maechler at stat.math.ethz.ch
Fri Aug 7 11:40:05 CEST 2009
>>>>> "KB" == Ken Beath <ken at kjbeath.com.au>
>>>>> on Fri, 7 Aug 2009 18:44:37 +1000 writes:
KB> This will fit using a 64-bit version of R, but unless there is
KB> more than the 4 GB of memory I have, it will run slowly.
KB> Like Thierry, I wonder if you really want a 6000-level fixed effect.
yes, indeed.
Note, however, that Doug Bates and I gave talks at useR! 2009 in
Rennes and at DSC 2009 in Copenhagen,
--> http://matrix.r-forge.r-project.org/slides/
where
- we mentioned that working with sparse model matrices
  has become much easier;
  --> indeed, the latest (*-30) version of Matrix now provides a
  function
      sparse.model.matrix()
  [ in the future hopefully to be superseded by a base R
    model.matrix(....., sparse=TRUE) option ]
  which lets you produce a sparse design matrix directly from a
  formula and a model.frame / data.frame
  (see the small sketch right after this list);
- we used a somewhat interesting case with n ~ 70,000
  observations of non-perfectly nested student / teacher data,
  with a student random effect (~ 3000 levels) but a
  teacher fixed effect (1128 levels),
  something which IIRC is too large for ca. 1 GB of RAM, but
  just barely works with 2 GB or so.
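Here is a small sketch of sparse.model.matrix() in action (the toy
data frame is made up for illustration, not taken from the talks):

    library(Matrix)
    d <- data.frame(f = factor(sample(1:1000, 10000, replace = TRUE)))
    X <- sparse.model.matrix(~ f, data = d)  # sparse design matrix
    class(X)  # "dgCMatrix", a compressed sparse column matrix
    ## compare memory use against the dense equivalent:
    object.size(X)
    object.size(model.matrix(~ f, data = d))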
Anyway, here we used the new  lmer2(...., sparseX = TRUE)
code in the not-yet-released, but R-forge-available "lme4a" package
("a": formerly called the "allcoef" branch of lme4),
which allowed us to circumvent the memory problems, as now
both X (fixed effects) and Z (random effects) are sparse
matrices.
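For Kevin's example below, something along these lines should then
work (just a sketch, assuming a current lme4a checkout from R-forge;
the interface there is still in flux and may change):

    ## install.packages("lme4a", repos = "http://R-Forge.R-project.org")
    library(lme4a)
    m0 <- lmer2(y ~ 1 + L + (1 | H), data = dat, sparseX = TRUE)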
Note that the very last slide of the Copenhagen talk has a nice
plot of fixed vs. random effects for the teachers, which shows
that
1) yes, the random effects are "just" shrunken versions of the
   fixed effects;
2) but: the ordering *is* changed to some extent, and if you
   want to *rank* the teachers, this can be of
   considerable "political" importance.
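If you want to produce such a comparison yourself, a rough sketch
(with hypothetical data 'd' and factor 'teacher' standing in for the
talk's data; this is not the exact code behind the slide):

    library(lme4)
    mF <- lm(y ~ 0 + teacher, data = d)          # teacher fixed
    mR <- lmer(y ~ 1 + (1 | teacher), data = d)  # teacher random
    fe <- coef(mF)                               # per-teacher estimates
    re <- fixef(mR)[1] + ranef(mR)$teacher[, 1]  # shrunken predictions
    plot(fe, re, xlab = "fixed effects", ylab = "random effects")
    abline(0, 1, lty = 2)
    cor(rank(fe), rank(re))  # < 1  <==>  the ranking changes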
Best regards,
Martin Maechler, ETH Zurich
KB> Ken
KB> On 07/08/2009, at 4:56 AM, Kevin W wrote:
>> I have a simple model that appears to be too big for lmer (using
>> 2 GB of memory). I _can_ fit the model with asreml, but I would
>> like to make a comparison with lmer. Simulated data is used below,
>> but I have real data that causes the same problem.
>>
>> library(lattice)  # for bwplot()
>> set.seed(496789)
>> dat <- data.frame(H=sample(1:51, 24000, replace=TRUE),
>>                   L=sample(1:6101, 24000, replace=TRUE))
>> Heff <- rnorm(51, sd=sqrt(40))
>> Leff <- rnorm(6101, sd=sqrt(1200))
>> err <- rnorm(24000, sd=10)
>> dat$y <- 100 + Heff[dat$H] + Leff[dat$L] + err
>> dat <- transform(dat, H=factor(H), L=factor(L))
>> str(dat)
>> bwplot(y~H, dat)  # looks right
>>
>> Using asreml recovers the variance components almost exactly:
>>
>> m1 <- asreml(y~1, data=dat, sparse=~L, random=~H)
>>
>> summary(m1)$varcomp
>>             component  std.error   z.ratio constraint
>> H            50.96058  10.249266  4.972121   Positive
>> R!variance  100.07324   1.056039 94.762853   Positive
>>
>> Now try lmer:
>>
>> m0 <- lmer(y~1+L+(1|H), data=dat)
>>
>> Error: cannot allocate vector of size 1.1 Gb
>> In addition: Warning messages:
>> 1: In model.matrix.default(mt, mf, contrasts) :
>> Reached total allocation of 1535Mb: see help(memory.size)
>>
>> Am I pushing lmer past its limits (given the 2 GB of memory), or
>> is there a way to make this fit?
>>
>>
>> Kevin Wright