[R-sig-ME] large dataset
dbp at uiuc.edu
Wed Jan 31 00:16:52 CET 2007
I'm attempting to fit a crossed random effects model to a rather large
data set. This is EU parliament voting data (the response variable is
binary) from 574 legislators over 2123 votes. EU parliamentarians
miss a lot of votes, so there are ~700,000 total observations. The
model also includes quite a few covariates, on the order of 30-50
depending on the particular specification (mostly fixed effects for
country, party, etc.). I have been unable to fit a crossed effects
logit model to this data with lme4 without exhausting system memory.
I have a quad-core Intel Linux machine with 8 GB of RAM and a lot of
swap to play with, but I'm still falling short.
Interestingly, I've successfully fit this model using HLM6 on a
machine with substantially less RAM.
My question is largely about feasibility. I would like to use lme4 to
analyze this dataset because it provides a much better set of features
for checking model fit and generating predictions than HLM (one can't
even get the fixed effects variance-covariance matrix out of HLM6's
crossed effects routine). Is fitting this model in lme4 simply
infeasible? Are there any ways to reduce lmer's memory footprint that
I might try? Would one expect a cross-classified logit model with
700,000 observations to require upwards of 12 GB of memory, or have I
uncovered a small memory leak that isn't visible with smaller
datasets? The memory use creeps up
slowly over the course of a run which is at least consistent with a
memory leak, but, not knowing anything about the implementation, I'm
just speculating wildly here. Obviously, I could sub-sample, but this
is already a sample of a larger dataset, so I'm loath to do that if I
can avoid it.
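
For a rough sense of scale, the figures quoted above can be turned into
a back-of-the-envelope calculation. The sketch below is in Python and is
purely illustrative: it assumes 8-byte doubles, dense storage, and one
random intercept per legislator and per vote. It says nothing about how
lmer actually allocates memory, and all names in it are hypothetical.

```python
# Back-of-the-envelope sizes from the figures quoted in the post.
# Assumptions (not from lme4): 8-byte doubles, dense storage,
# one random intercept per legislator and one per vote.

BYTES_PER_DOUBLE = 8


def dense_bytes(rows, cols):
    """Bytes needed for a dense rows x cols matrix of doubles."""
    return rows * cols * BYTES_PER_DOUBLE


n_legislators = 574
n_votes = 2123
n_obs = 700_000       # approximate count given in the post
n_covariates = 50     # upper end of the quoted 30-50 range

# Cells in the legislator x vote grid if nobody had missed a vote.
full_grid = n_legislators * n_votes

# Number of random effects under the crossed-intercepts assumption.
n_random_effects = n_legislators + n_votes

# Dense fixed-effects design matrix: observations x covariates.
fixed_design = dense_bytes(n_obs, n_covariates)

# Random-effects indicator matrix if it were ever stored densely.
random_dense = dense_bytes(n_obs, n_random_effects)

print(f"full legislator x vote grid: {full_grid:,} cells")
print(f"share of grid observed: {n_obs / full_grid:.0%}")
print(f"dense fixed-effects design matrix: {fixed_design / 1e6:.0f} MB")
print(f"dense random-effects indicators: {random_dense / 1e9:.1f} GB")
```

Under these assumptions the fixed-effects design matrix alone is about
280 MB, while a dense indicator matrix for the 2,697 random effects
would be about 15 GB, so any step that stores the random-effects
structure densely rather than sparsely could plausibly account for
memory use on the order described.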
Department of Political Science
University of Illinois at Urbana-Champaign
702 S. Wright St.
Urbana, IL 61801
Email: dbp at uiuc.edu