[R-sig-ME] spaMM::fitme() - a glmm for longitudinal data that accounts for spatial autocorrelation
Francois Rousset
francois.rousset using umontpellier.fr
Wed Jul 15 00:10:26 CEST 2020
Dear Thierry,
please provide a reproducible example so that we know what you have
actually done.
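
For illustration only, a reproducible example could take roughly the following shape. This is a minimal sketch on simulated data, not the model from this thread: the column names (lon, lat, resp), the sample size and the covariance parameters are all hypothetical, and the fit uses spaMM only because that is the package under discussion. A comparable INLA script with its timing would allow the two to be compared on the same data.

```r
# Hypothetical toy data: n distinct random locations, one observation each.
set.seed(123)
n <- 200  # small on purpose; increase gradually to see how run time scales
d <- data.frame(lon = runif(n), lat = runif(n))

# Simulate a spatially correlated random effect with an exponential
# covariance (a Matern covariance with smoothness 0.5); range 0.3 is arbitrary.
C <- exp(-as.matrix(dist(d[, c("lon", "lat")])) / 0.3)
u <- drop(t(chol(C)) %*% rnorm(n))
d$resp <- 1 + u + rnorm(n, sd = 0.5)

# Fit and time the model only if spaMM is installed.
if (requireNamespace("spaMM", quietly = TRUE)) {
  library(spaMM)
  timing <- system.time(
    fit <- fitme(resp ~ 1 + Matern(1 | lon + lat), data = d)
  )
  print(timing)
  print(summary(fit))
}

# Package versions matter when comparing run times.
sessionInfo()
```

Reporting the number of distinct locations, the exact call, the elapsed time and the session information makes it possible for others to repeat the comparison.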
Best,
F.
On 14/07/2020 at 20:00, Thierry Onkelinx wrote:
> Dear François and Sarah,
>
> INLA seems more efficient. I ran a model with a Matérn correlation
> structure on 13K locations (1 observation per location) in under 10
> minutes on a laptop with 16GB RAM.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> AND FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be <mailto:thierry.onkelinx using inbo.be>
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be <http://www.inbo.be>
>
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no
> more than asking him to perform a post-mortem examination: he may be
> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does
> not ensure that a reasonable answer can be extracted from a given body
> of data. ~ John Tukey
> ///////////////////////////////////////////////////////////////////////////////////////////
>
> On Tue 14 Jul 2020 at 18:22, Francois Rousset
> <francois.rousset using umontpellier.fr> wrote:
>
> Dear Sarah,
>
> On 14/07/2020 at 16:55, Sarah Chisholm wrote:
> > Hi Mollie, thank you for your suggestion. glmmTMB seems like a good
> > option for my needs as well. In your sample code above, can you
> > explain what the term 'group' does in matern(pos+0|group)? Does this
> > allow the spatial correlation structure to be applied to specific
> > groupings in the data (in my case, for example, by 'continent')?
> >
> > Francois, thank you for this very clear answer. This is a very
> > convenient feature of the function! May I ask you a couple of other
> > questions about some issues that I've had with spaMM::fitme()?
> >
> > In particular, when I try fitting this model to a large data set
> > (~14 000 rows x 7 columns, ~2 MB), the model will run for an extended
> > period of time, to the point where I've had to terminate the
> > computation. I've tried applying the suggestions that are mentioned in
> > the user guide, i.e. setting init=list(lambda=0.1)
> > and init=list(lambda=NaN). Implementing init=list(lambda=0.1) returned
> > an error suggesting that there was a lack of memory, while running the
> > model with init=list(lambda=NaN) also ran for an extended period of
> > time without completing. Is there something else I can do to speed up
> > the fit of these models?
> >
> > I've had a similar problem with an even larger data set (~185 000 rows
> > x 8 columns, ~21 MB), where, when I try running the model, this error
> > is returned immediately:
> >
> > Error in ZA %*% xmatrix : Cholmod error 'problem too large' at file
> > ../Core/cholmod_dense.c, line 105
> >
> > I've tried running this model on two devices, both with a 64-bit OS
> > with Windows 10, one with 32 GB of RAM and the other with 64 GB. I've
> > gotten the same error from both devices. Is there a way that fitme()
> > can accommodate these large data sets?
>
> spaMM can handle large data sets, but the first issue to consider here
> is the number of distinct locations for the spatial random effect. The
> large correlation matrices of geostatistical models will always be a
> problem, both in terms of memory requirements and of potentially huge
> computation times. My guess from past experiments is that one should
> still be able to fit models with ~10K locations within a few days on a
> computer with <60 GB of RAM (given perhaps some tinkering with the
> arguments), so at least the data set of 14 000 rows should be feasible,
> particularly if the number of locations is smaller.
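
To put rough numbers on the memory requirement, here is a back-of-the-envelope sketch. It assumes that every row is a distinct location and that a single correlation matrix is stored densely in double precision (8 bytes per entry); the helper name dense_matrix_gb is made up for this illustration. A real fit typically needs several such matrices plus working copies, so these are lower bounds, and the figures shrink quickly if locations are repeated across rows.

```r
# Memory for one dense n x n double-precision matrix, in GB (1 GB = 1e9 bytes).
# dense_matrix_gb is a hypothetical helper, not a spaMM function.
dense_matrix_gb <- function(n_locations) {
  8 * n_locations^2 / 1e9
}

# Sizes discussed in this thread, assuming one location per row.
n_locations <- c(10000, 14000, 185000)
data.frame(n_locations, gb = dense_matrix_gb(n_locations))
```

Under these assumptions, 10 000 locations need about 0.8 GB per matrix and 14 000 about 1.6 GB, whereas 185 000 distinct locations would need about 274 GB for a single matrix, well beyond 64 GB of RAM.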
>
> Anyone planning to analyze large spatial data sets should anticipate
> these problems and check by themselves whether there is any practical
> alternative suitable for their particular problem. The discussion in
> section 6.2 of the "gentle introduction" to spaMM may then be useful.
>
> Best,
>
> F.
>
> >
> > Thank you,
> >
> > Sarah
>