[R-sig-ME] How to know if random intercepts and slopes are necessary for glmer.nb model

Tue Oct 20 12:42:48 CEST 2015

On 20/10/2015 11:32, David Jones wrote:
> Dear Alain - Thank you for these suggestions. In response to your 
> questions:
>
> Poisson GLMM equivalents often run for these models (though I get the 
> warning, "Model is nearly unidentifiable: very large eigenvalue"). For 
> the models that do fit without problems, overdispersion tests do 
> reflect overdispersion, and negative binomial model equivalents 
> reflect better fit based on a chi-square comparison of -2LL.

David...a better LL (or a significant chi-square test) is not in itself 
a reason to apply an NB GLM or NB GLMM. Overdispersion can have at 
least 10 different causes (each requiring a different 
solution)...and you need to pinpoint what is driving the overdispersion. 
If you pick the wrong cause, you may end up with the wrong model.
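
As a rough illustration of looking at what drives the overdispersion, 
rather than only testing for it, the sketch below fits a plain Poisson 
GLM to simulated data and asks how much of the Pearson statistic comes 
from the largest 1% of squared residuals. The data, the names (d, 
long_stay, a2) and the 1% cut-off are assumptions made for this example; 
none of them come from the hospital data set.

```r
set.seed(42)

# Simulated stand-in data (hypothetical): most stays are short,
# but 20 of 2000 patients stay much longer
n <- 2000
d <- data.frame(a2 = factor(rep(c("A", "B"), length.out = n)))
d$y <- rpois(n, lambda = 4)
long_stay <- sample(n, 20)
d$y[long_stay] <- rpois(20, lambda = 60)

m_pois <- glm(y ~ a2, family = poisson, data = d)

# Pearson dispersion statistic; values well above 1 indicate overdispersion
r2   <- residuals(m_pois, type = "pearson")^2
disp <- sum(r2) / df.residual(m_pois)

# Share of the Pearson statistic contributed by the largest 1% of residuals
k         <- ceiling(0.01 * n)
top_share <- sum(sort(r2, decreasing = TRUE)[seq_len(k)]) / sum(r2)

print(c(dispersion = disp, top_share = top_share))
```

If a handful of observations account for most of the statistic, the 
extreme values are a likely driver; a more even spread would suggest 
looking at other causes, such as a missing covariate or unmodelled 
dependence.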

However....you state that the overdispersion is due to a few patients 
who stay for a long time in the hospital. That would be an argument in 
favour of using the NB GLMM, but also of using a Poisson GLMM with an 
observation-level random intercept. The latter is much faster to 
estimate. It is not my favourite model....but being pragmatic......it is 
perhaps the way forward.
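
A minimal sketch of a Poisson GLMM with an observation-level random 
intercept, using lme4::glmer on simulated data. The data frame d, the 
factor obs and the simulation settings are hypothetical stand-ins for 
the real data; only the model structure (one random-intercept level per 
row, added to the hospital intercept) is the point.

```r
library(lme4)

set.seed(1)

# Simulated stand-in data (hypothetical): 20 hospitals, 30 patients each
n_hosp <- 20
n_per  <- 30
d <- data.frame(
  Hospital = factor(rep(seq_len(n_hosp), each = n_per)),
  a2       = factor(rep(c("A", "B"), length.out = n_hosp * n_per))
)
hosp_eff <- rnorm(n_hosp, sd = 0.3)
mu  <- exp(1 + 0.3 * (d$a2 == "B") + hosp_eff[as.integer(d$Hospital)])
d$y <- rnbinom(nrow(d), mu = mu, size = 1.5)  # overdispersed counts

# One factor level per row: the observation-level random intercept
d$obs <- factor(seq_len(nrow(d)))

# Plain Poisson GLMM, then the same model plus the observation-level term
m_pois <- glmer(y ~ a2 + (1 | Hospital), family = poisson, data = d)
m_olre <- glmer(y ~ a2 + (1 | Hospital) + (1 | obs),
                family = poisson, data = d)

# Pearson dispersion of the plain Poisson GLMM (well above 1 here)
disp_pois <- sum(residuals(m_pois, type = "pearson")^2) / df.residual(m_pois)
print(disp_pois)

# The obs variance absorbs the extra-Poisson variation
print(VarCorr(m_olre))
```

The dispersion statistic is only computed for the plain Poisson model; 
once the observation-level term is in the model, the extra variation is 
carried by the obs variance instead.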

Setting the theta to a fixed value in glmer.nb will certainly help.
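
One way to sketch this, under assumptions: take theta from an NB GLM 
without random effects (MASS::glm.nb) and then fit the mixed model with 
glmer and family = MASS::negative.binomial(theta = ...), which keeps 
theta fixed at that value. This is a stand-in for narrowing the interval 
argument of glmer.nb, not the same call. The data and all object names 
below are hypothetical, and a theta estimated while ignoring the 
hospital effect is only a starting approximation.

```r
library(lme4)
library(MASS)

set.seed(2)

# Simulated stand-in data (hypothetical): 20 hospitals, 30 patients each
n_hosp <- 20
n_per  <- 30
d <- data.frame(
  Hospital = factor(rep(seq_len(n_hosp), each = n_per)),
  a2       = factor(rep(c("A", "B"), length.out = n_hosp * n_per))
)
hosp_eff <- rnorm(n_hosp, sd = 0.3)
mu  <- exp(1 + 0.3 * (d$a2 == "B") + hosp_eff[as.integer(d$Hospital)])
d$y <- rnbinom(nrow(d), mu = mu, size = 1.5)

# Step 1: theta from a "nearby" NB GLM that ignores the hospital effect
m_glm  <- glm.nb(y ~ a2, data = d)
theta0 <- m_glm$theta

# Step 2: NB GLMM with theta held fixed at theta0 (not re-estimated)
m_nb_fixed <- glmer(y ~ a2 + (1 | Hospital),
                    family = MASS::negative.binomial(theta = theta0),
                    data = d)

print(theta0)
print(fixef(m_nb_fixed))
```

Because theta is no longer a free parameter, the optimiser only has to 
deal with the fixed effects and the random-effect variance, which is 
the part that makes the estimation easier.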

Kind regards,

Alain

PS...is length of stay in a hospital not strictly positive? Not that I 
want to suggest using a zero-truncated distribution for a data set with 
500,000 observations....:-)

> The DV is length of stay in hospital and the overdispersion is due to 
> some patients who stay for a very long time. For hospital count, there 
> are over 150 hospitals.
>
> On Tue, Oct 20, 2015 at 6:18 AM, Highland Statistics Ltd 
> <highstat at highstat.com> wrote:
>
>
>
>
>         ----------------------------------------------------------------------
>
>         Message: 1
>         Date: Mon, 19 Oct 2015 08:59:40 -0400
>         From: David Jones <david.tn.jones at gmail.com>
>         To: r-sig-mixed-models at r-project.org
>         Subject: [R-sig-ME] How to know if random intercepts and
>         slopes are necessary for glmer.nb model
>         Message-ID:
>         <CAJgUswL0mkbgpv-Xt1MsPtVbm9qGUZ+uaJ+wugPZw8Dvh-XcLA at mail.gmail.com>
>         Content-Type: text/plain; charset="UTF-8"
>
>         I am receiving a number of different warnings/errors when running
>         glmer.nb on a fairly large dataset (N>500,000). For some of the
>         models I have run, program-reported errors prevent the generation
>         of estimates. I suspect that it is because the random effects are
>         very small. I have tried models with random intercepts, as well as
>         models with both random intercepts and slopes (all models include
>         fixed effects). I am running models on a dataset which in theory
>         would include random effects (patients nested within hospitals).
>
>         My question is: how do you know if random intercepts and slopes
>         are necessary, if you can't even estimate the random effects
>         models (and thus use a model comparison test)? As I am aware you
>         can look at design effects to evaluate if a random intercept is
>         necessary (though please correct me if I am wrong here).
>
>         Some example code I have used is below - many thanks.
>
>         a2 <- as.factor(analysis$Location)
>         NBIntercept <- glmer.nb(y ~ a2 + (1 | Hospital), data = analysis)
>         NBInterceptSlope <- glmer.nb(y ~ a2 + (1 | Hospital) + (1 + a2 | Hospital),
>                                      data = analysis)
>
>
>
>     David....this is a bit of a 'Gandalf' question. Perhaps you
>     should first figure out why the NB GLMM does not run. How many
>     hospitals do you have? Perhaps you can set the theta parameter in
>     glmer.nb to a fixed value (use an interval with nearly the same
>     lower and upper limit).... and get the (log of) theta from a
>     nearby NB GLM model. That would certainly make the estimation
>     process easier!
>
>     Why are you doing an NB GLMM? Do the Poisson GLMM equivalents run?
>     I assume you had overdispersion. What was driving the overdispersion?
>
>     And if computing time is slow for the second NB GLMM model, fit
>     the first model and see whether there are any a2 effects per
>     hospital in the residuals of the first model.
>
>
>     Alain
>
>
>
>
>     -- 
>     Dr. Alain F. Zuur
>
>     First author of:
>     1. Beginner's Guide to GAMM with R (2014).
>     2. Beginner's Guide to GLM and GLMM with R (2013).
>     3. Beginner's Guide to GAM with R (2012).
>     4. Zero Inflated Models and GLMM with R (2012).
>     5. A Beginner's Guide to R (2009).
>     6. Mixed effects models and extensions in ecology with R (2009).
>     7. Analysing Ecological Data (2007).
>
>     Highland Statistics Ltd.
>     9 St Clair Wynd
>     UK - AB41 6DZ Newburgh
>     Tel:   0044 1358 788177
>     Email: highstat at highstat.com
>     URL: www.highstat.com
>
>
