[R-sig-ME] mixed lognormal hurdle model with multiple grouping factors

Mon Dec 23 19:11:58 CET 2019

Hi Guillaume,

The mixed_model in the GLMMadaptive package supports Two-Part Mixed Effects
Model for Semi-Continuous Data (
https://drizopoulos.github.io/GLMMadaptive/articles/ZeroInflated_and_TwoPart_Models.html#zero-inflated-negative-binomial-mixed-effects-model).
It doesn't seem to support mixed effect for the zero outcome component
though. Another downside is that, since it is written in pure R, it is much
slower than glmmTMB (for similar models).

Liming

On Fri, Dec 20, 2019 at 6:31 AM Guillaume Adeux <guillaumesimon.a2 using gmail.com>
wrote:

> Yes, for now, the most seducing solution has been the following (with the
> brms package):
>
>
> fit=brm(bf(H_q~block+syst+(1|plot)+(1|year)+(1|plot:year),hu~block+syst+(1|plot)+(1|year)+(1|plot:year)),data=density,family=hurdle_lognormal(),control
> = list(adapt_delta = 0.999))
>
>  Family: hurdle_lognormal
>   Links: mu = identity; sigma = identity; hu = logit
> Formula: H_q ~ block + syst + (1 | plot) + (1 | year) + (1 | plot:year)
>          hu ~ block + syst + (1 | plot) + (1 | year) + (1 | plot:year)
>    Data: density (Number of observations: 480)
> Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
>          total post-warmup samples = 4000
>
> Group-Level Effects:
> ~plot (Number of levels: 10)
>                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
> Tail_ESS
> sd(Intercept)        0.12      0.11     0.01     0.41 1.00      852
> 912
> sd(hu_Intercept)     1.21      1.01     0.05     3.86 1.01      637
>  1011
>
> ~plot:year (Number of levels: 60)
>                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
> Tail_ESS
> sd(Intercept)        0.13      0.06     0.01     0.23 1.00      737
> 546
> sd(hu_Intercept)     2.10      0.46     1.34     3.15 1.00     1427
>  2435
>
> ~year (Number of levels: 6)
>                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
> Tail_ESS
> sd(Intercept)        0.06      0.07     0.00     0.21 1.00     1497
>  1843
> sd(hu_Intercept)     0.83      0.71     0.03     2.55 1.01      935
>  1469
>
> Population-Level Effects:
>              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
> Intercept       -0.23      0.07    -0.38    -0.08 1.00     2159     1513
> hu_Intercept    -1.04      0.66    -2.26     0.43 1.00     1832     2061
> block1          -0.06      0.06    -0.17     0.07 1.00     2394     1933
> syst1           -0.32      0.17    -0.64     0.03 1.00     1930     1487
> syst2           -0.11      0.12    -0.35     0.13 1.00     2023     1617
> syst3            0.18      0.11    -0.06     0.41 1.00     1724     1267
> syst4            0.08      0.12    -0.16     0.32 1.00     2222     1691
> hu_block1        0.36      0.60    -0.81     1.63 1.00     1658     1684
> hu_syst1         4.19      1.34     1.61     6.91 1.00     1236     1233
> hu_syst2        -0.09      1.17    -2.40     2.38 1.00     1266     1419
> hu_syst3        -1.98      1.30    -4.59     0.54 1.01     1183     1235
> hu_syst4        -0.52      1.23    -2.95     1.88 1.00     1170     1108
>
> Family Specific Parameters:
>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
> sigma     0.48      0.02     0.44     0.52 1.00     2632     3025
>
> Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
> is a crude measure of effective sample size, and Rhat is the potential
> scale reduction factor on split chains (at convergence, Rhat = 1).
>
>
> I am currently struggling to obtain "syst" means averaged over blocks (i.e.
> what would be done traditionally with emmeans(fit,~syst)) and to compare
> them pairwise.
>
> As of now, I am only capable of retrieving a mean for each combination of
> "syst" and "block" :
>
> newdata =
> expand.grid(syst=levels(density$syst),block=levels(density$block))
> means=fitted(fit,newdata,re_formula=NA,summary=TRUE)
> colnames(means)=c("fit","se","lwr","upr")
> df_plot=cbind(newdata,means)
>
> I am sure this must be simple, particularly with the hypothesis() function
> but I am not sure how to include the "zero-non zero part" (i.e. hu_...).
>
> If anyone can foward some information on this, I would be eternally
> grateful (or even more should I say).
>
> Sincerely,
>
> Guillaume ADEUX
>
>
>
> Le ven. 20 déc. 2019 à 15:12, Ben Bolker <bbolker using gmail.com> a écrit :
>
> >
> >   Good point.
> >   This might be manageable in MCMCglmm or brms (or JAGS) ...
> >
> > On 2019-12-20 2:58 a.m., D. Rizopoulos wrote:
> > > For a hurdle model for repeated measurements data, the dichotomous
> > outcome I(diversity > 0) is also a repeated measurements outcome. Hence,
> in
> > the logistic regression model for this dichotomous outcome you will need
> to
> > include random effects to account for the correlations. And it is logical
> > to assume that the random effects from this logistic regression model
> will
> > be correlated with the random effects of the linear mixed model for only
> > the positive responses.
> > >
> > > In this case the likelihood of the two parts does not split in two
> > functionally independent parts that can be separately maximized. If this
> is
> > indeed the case, then fitting the two parts separately may cause bias and
> > loss of efficiency.
> > >
> > > Best,
> > > Dimitris
> > >
> > > ��
> > > Dimitris Rizopoulos
> > > Professor of Biostatistics
> > > Erasmus University Medical Center
> > > The Netherlands
> > >
> > > ________________________________
> > > From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on
> > behalf of Mollie Brooks <mollieebrooks using gmail.com>
> > > Sent: Wednesday, December 18, 2019 3:08 PM
> > > To: Help Mixed Models; Guillaume Adeux
> > > Subject: Re: [R-sig-ME] mixed lognormal hurdle model with multiple
> > grouping factors
> > >
> > > Hi Guillaume,
> > >
> > > I don�t think the hurdle lognormal can be fit in a single function call
> > to glmmTMB since the model for the non-zero response requires
> > log-transforming the response. Other types of hurdle models could be fit
> in
> > glmmTMB using the zero-inflation model.
> > >
> > > I don�t think you gain much information in hurdle models by modeling
> the
> > two parts (zeros and non-zeros) in one function call. The only potential
> > benefit to fitting a hurdle in a single function call is that you get
> > likelihood and AIC for the entire data set, but I don�t know if those are
> > produced by brms.
> > >
> > > You could just fit a binomial model for the zero-non-zero process (i.e.
> > monoculture) like
> > >
> mod_binom=lmer((diversity>0)~block+syst+(1|plot)+(1|year)+(1|plot:year),
> > data=density, family=binomial)
> > >
> > > and then fit a model to the log of the positive data
> > >
> mod_gaus=lmer(log(diversity)~block+syst+(1|plot)+(1|year)+(1|plot:year),
> > data=subset(density, diversity>0))
> > >
> > > Or, given that the outcome is non-negative and continuous, it might
> make
> > sense to try a Tweedie distribution, but I�m not sure I�ve seen this
> > applied to diversity indices in the literature. Has anyone else seen this
> > done?
> > > mod_twe = glmmTMB(diversity~block+syst+(1|plot)+(1|year)+(1|plot:year),
> > data=density, family=tweedie)
> > >
> > > Cheers,
> > > Mollie
> > >
> > >> On 18Dec 2019, at 14:45, Cesko Voeten <c.c.voeten using hum.leidenuniv.nl>
> > wrote:
> > >>
> > >> Hi Guillaume,
> > >>
> > >> If you're not afraid to go Bayesian, brms can do it. Alternatively,
> you
> > may be able to use glmmTMB and treat the hurdle part as zero inflation,
> but
> > this is conceptually not the same thing as a hurdle model so you would
> need
> > to judge whether that would make sense at all for your application.
> > >>
> > >> HTH,
> > >> Cesko
> > >>
> > >> Op 18-12-2019 om 13:48 schreef Guillaume Adeux:
> > >>> Hi everyone,
> > >>> I am looking for a package which can handle "hurdle.lognormal"
> > distribution
> > >>> family and multiple grouping factors.
> > >>> GLMMadaptive seemed as the way to go but unfortunately, to the best
> of
> > my
> > >>> knowledge, it does not handle multiple grouping factors (random
> > effects).
> > >>> You may ask why? I am analyzing plant diversity and one of the
> > treatments
> > >>> led to plots which were dominated by one species. Hence, certain
> > diversity
> > >>> indices are estimated as zero in these plots, and produces a mass at
> > zero.
> > >>> All other values are positive and continuous.
> > >>> Anyone have an idea of a package/function which can handle this? Or
> any
> > >>> alternative approach?
> > >>> In lmer syntax, the model is the following:
> > >>>
> >
> mod=lmer(diversity~block+syst+(1|plot)+(1|year)+(1|plot:year),data=density,REML=F)
> > >>> Thank you for your time and help.
> > >>> Sincerely,
> > >>> Guillaume ADEUX
> > >>> [[alternative HTML version deleted]]
> > >>> _______________________________________________
> > >>> R-sig-mixed-models using r-project.org mailing list
> > >>>
> >
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=02%7C01%7Cd.rizopoulos%40erasmusmc.nl%7C0f8f22d15f244f0190f208d783c3ca4b%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637122749245106064&sdata=XWAMTSylM6j81Lf%2BLb8qikDEho3QIo8l%2Bh%2BnxSpwjHM%3D&reserved=0
> > >>>
> > >>
> > >> _______________________________________________
> > >> R-sig-mixed-models using r-project.org mailing list
> > >>
> >
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=02%7C01%7Cd.rizopoulos%40erasmusmc.nl%7C0f8f22d15f244f0190f208d783c3ca4b%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637122749245106064&sdata=XWAMTSylM6j81Lf%2BLb8qikDEho3QIo8l%2Bh%2BnxSpwjHM%3D&reserved=0
> > >
> > > _______________________________________________
> > > R-sig-mixed-models using r-project.org mailing list
> > >
> >
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=02%7C01%7Cd.rizopoulos%40erasmusmc.nl%7C0f8f22d15f244f0190f208d783c3ca4b%7C526638ba6af34b0fa532a1a511f4ac80%7C0%7C0%7C637122749245106064&sdata=XWAMTSylM6j81Lf%2BLb8qikDEho3QIo8l%2Bh%2BnxSpwjHM%3D&reserved=0
> > >
> > >       [[alternative HTML version deleted]]
> > >
> > >
> > > _______________________________________________
> > > R-sig-mixed-models using r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > >
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]