[R-sig-ME] Pulling specific parameters from models to prevent exhausting memory.

Ades, James jades at health.ucsd.edu
Mon Oct 19 06:50:57 CEST 2020


Thanks, Cesko. I'll look into BAM.

James
________________________________
From: Voeten, C.C. <c.c.voeten at hum.leidenuniv.nl>
Sent: Sunday, October 18, 2020 1:16 AM
To: Ades, James <jades at health.ucsd.edu>; r-sig-mixed-models at r-project.org <r-sig-mixed-models at r-project.org>
Subject: RE: Pulling specific parameters from models to prevent exhausting memory.

Hi James,

You may have luck with mgcv::bam instead of lme4: it can also fit random-slope models and is optimized for "big data" in both memory usage and computational efficiency. The modeling syntax is slightly different, though; the correct translation of lme4 random effects into mgcv's s(..., bs = 're') terms depends on whether timepoint.nu is a covariate or a factor.
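
For concreteness, here is a minimal sketch of one possible translation, assuming timepoint.nu is numeric and subjectID and roi are factors (the variable names are taken from your model below; discrete = TRUE and nthreads = 4 are illustrative settings, not tuned recommendations):

library(mgcv)

# 're' smooths: a lone factor gives random intercepts; a factor (or pair of
# factors) combined with a numeric gives random slopes of that numeric for
# each factor level (or level combination). discrete = TRUE discretizes the
# covariates to reduce memory use, and nthreads spreads the work over cores.
m <- bam(connectivity ~ roi * timepoint +
           s(subjectID, bs = "re") +                    # random intercept per subject
           s(subjectID, timepoint.nu, bs = "re") +      # random time slope per subject
           s(subjectID, roi, bs = "re") +               # random intercept per subject:roi
           s(subjectID, roi, timepoint.nu, bs = "re"),  # random time slope per subject:roi
         data = data, discrete = TRUE, nthreads = 4)

Note that these 're' terms fit each intercept and slope as independent random effects; unlike the lme4 formula below, they do not estimate an intercept-slope correlation.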

HTH,
Cesko

> -----Original Message-----
> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> On
> Behalf Of Ades, James
> Sent: Sunday, October 18, 2020 2:01 AM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Pulling specific parameters from models to prevent
> exhausting memory.
>
> Hi all,
>
> I'm modeling fMRI imaging data using lme4. There are 4 time points and
> roughly 550 subjects with 27,730 regions of interest (these are the variables).
> Since I have access to a supercomputer, my thought was to create a long
> dataset with regions of interest repeated within each time point and subjects
> repeated over the 4 time points, using the model below. I run it on the
> supercomputer because the stacked data come to roughly 70 million
> observations. timepoint is a factor and timepoint.nu is the same time point
> coded numerically.
>
> lmer(connectivity ~ roi * timepoint + (timepoint.nu | subjectID) +
>        (timepoint.nu | subjectID:roi),
>      data = data, REML = FALSE, na.action = 'na.exclude',
>      control = lmerControl(optimizer = "nloptwrap", calc.derivs = FALSE))
>
> I received the following error: "cannot allocate vector of size 30206.2 Gb",
> followed by "Execution halted".
>
> So I'm wondering how, while fitting, I can pull only the essential parameters I
> need (group means rather than individual fixed effects) so that the
> supercomputer can finish the job without exhausting memory. I say group
> means because I will eventually be adding covariates.
>
> Also, the supercomputer's rules are that a job must finish within two days. I'm
> not sure this one would, so I'm wondering whether there is any way to
> parallelize lme4 so that I could make use of multiple cores and nodes.
>
> I've included a slice of data here:
> https://drive.google.com/file/d/1mhTj6qZZ2nT35fXUuYG_ThQ-QtWbb-8L/view?usp=sharing
>
> Thanks much,
>
> James
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



