[R-sig-ME] Model specification help

Andrew Perrin clists at perrin.socsci.unc.edu
Fri Mar 9 17:39:01 CET 2007


Update: I decided to run the lmer and lmer2 versions of the code you 
suggested simultaneously, on two machines:

> grades.lmer<-lmer(grade.pt ~ (1|stud.id) + (1|instr.id) + (1|cour.dep),
+                   newgrades.stripped.df,
+                   control =
+                   list(gradient = FALSE, niterEM = 0, msVerbose = 1)
+                   )
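
(The lmer2 run on the second machine is the same call with lmer2() swapped
in; a sketch only, assuming lmer2 takes the same formula, data, and control
arguments:)

> grades.lmer2<-lmer2(grade.pt ~ (1|stud.id) + (1|instr.id) + (1|cour.dep),
+                     newgrades.stripped.df,
+                     control =
+                     list(gradient = FALSE, niterEM = 0, msVerbose = 1)
+                     )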


They are still working their way through, but I thought it was interesting 
that (a) lmer2 seems to be using less RAM, by roughly 0.3G; (b) even lmer 
seems well within the 3G limit, maxing out at about 1.6G so far; and (c) 
for the first iteration, there are both similarities and differences:

lmer:    0  3.66128e+06: 0.0865649 0.0125233 0.000161387
lmer2:   0  3.66128e+06: 0.294219 0.111907 0.0127038

(Since I don't know what that diagnostic means, I can't tell whether the 
difference is anything to worry about!)
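
One pattern I do notice, though: the lmer2 numbers look like the square roots
of the lmer numbers, which (if I'm reading the trace right) would just mean
the two functions report the same starting point on different scales, e.g.
standard deviations versus variances, rather than genuinely disagreeing:

> c(0.294219, 0.111907, 0.0127038)^2  # roughly 0.0865648 0.0125232 0.000161387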

More when the models finish.
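
When they do, I'll probably compare them along these lines (a sketch,
assuming the usual extractor functions work on both kinds of fit):

> deviance(grades.lmer); deviance(grades.lmer2)  # do they reach comparable criteria?
> VarCorr(grades.lmer)                           # estimated variance components
> VarCorr(grades.lmer2)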

Andy

----------------------------------------------------------------------
Andrew J Perrin - andrew_perrin (at) unc.edu - http://perrin.socsci.unc.edu
Assistant Professor of Sociology; Book Review Editor, _Social Forces_
University of North Carolina - CB#3210, Chapel Hill, NC 27599-3210 USA
New Book: http://www.press.uchicago.edu/cgi-bin/hfs.cgi/00/178592.ctl



On Fri, 9 Mar 2007, Douglas Bates wrote:

> On 3/8/07, Andrew Perrin <clists at perrin.socsci.unc.edu> wrote:
>> On Thu, 8 Mar 2007, elw at stderr.org wrote:
>> 
>> >
>> >> Thank you for this. I will return to it tomorrow and let you know how it
>> >> goes. As for the machine it's running on: it's a dual-Xeon 2.8GHz IBM
>> >> eSeries server with 6GB RAM, running Debian Linux, kernel 2.6.18.  So the
>> >> 3GB per-process memory limit applies. I also have access to a shared server
>> >> with "twenty-four 1.05 GHz Ultra-Sparc III+ processors and 40 GB of main
>> >> memory" running Solaris, if that's better.
>> >
>> > Andrew,
>> >
>> > That gets you onto a 64-bit platform, beyond the 32-bit-Intel 4GB memory
>> > (3G for user process, 1G for OS kernel) limit, and beyond a bunch of other
>> > data size limits.  The memory bandwidth available to you on the Solaris
>> > machine is also likely to be much more significant - something that you
>> > will find quite pleasant for even some more trivial analyses.  :)
>> >
>> > Much better, certainly!  [And very much like what 'beefy' R code is most
>> > frequently run on...]
>> >
>> > W.r.t. the eSeries server you're commonly running on now - if you can have
>> > your systems people check to make sure that you have a PAE-enabled linux
>> > kernel running, you might be able to muscle past the 3GB mark with a single
>> > R process.... with some work.
>> >
>> > [If the machine can actually "see" all 6GB of memory, you probably have a
>> > PAE kernel.]
>> >
>> > --e
>> >
>> 
>> Ironically enough, I *am* the systems people for the eSeries.... having
>> been a unix sysadmin and perl programmer before cutting and running for
>> social science :).. The kernel is PAE enabled, but that only helps with
>> seeing 6G altogether, not over 3G for a single process. I toyed with the
>> idea of whether I could break down the process into several threaded ones,
>> but that's way above my head.
>> 
>> (The Solaris cluster is university-run, though.)
>
> I haven't done a thorough analysis of the memory usage in lmer but I
> can make some informed guesses as to where memory can be saved.  The
> details of the implementation and the slots in the internal
> representation of the model are given in the "Implementation" vignette
> in the lme4 package.  At present there is only one small example shown
> in there but I will add others.
>
> For the model fitting process itself the largest object needed is the
> symmetric sparse matrix in the A slot and the Cholesky factor of the
> updated A*.  The dimension of that square matrix is the sum of the
> sizes of the random effects vector and the fixed effects vector plus 1
> (for the response).  Generally the Cholesky factor will be slightly
> larger than the A but care is taken to make the Cholesky factor as
> small as possible.
>
> I enclose an example from fitting a model with two random effects per
> student, one random effect per teacher and two random effects per
> school to the star (Tennessee's Student-Teacher Achievement Ratio
> study) data.  The dimension of the random effects will be 2*10732 +
> 1374 + 2 * 80 so that easily dominates the dimension of A.
>
> In this case the sizes of the slots L, A, ZXyt and frame are
> comparable.  However, if we strip things down to the bare essentials
> we don't need ZXyt, frame, flist, offset and weights after the matrix
> A has been constructed.
>
> The dimension of the matrices L and A is dominated by the dimension of
> the random effects vector.  The dimension of ZXyt, etc. involves the
> number of observations.  This might be good news in your case in that
> the sizes of the parts that must be preserved are dominated by the
> number of students and not the number of grades recorded.
>
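
A note to self, prompted by the slot discussion above: once a fit is
available, something like the sketch below should show which slots dominate
memory (assuming the S4 slot names described above and the standard
slotNames()/object.size() tools):

> slot.sizes <- sapply(slotNames(grades.lmer),
+                      function(nm) object.size(slot(grades.lmer, nm)))
> round(sort(slot.sizes, decreasing = TRUE)/2^20)  # approximate sizes in MB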



