[R-sig-ME] How to know if random intercepts and slopes are necessary for glmer.nb model

Tue Oct 20 13:24:02 CEST 2015

Thank you for this input Alain - it will be very helpful as I hammer things
out. Reviewers did specifically ask for negative binomial so I am trying to
give them what they want ... not that this would be a good excuse if it
were the only rationale. I do share your cringe at the prospect
of running a zero-truncated model on this dataset :)

Also, following up on Ben Bolker's answers,

I found in a couple of previous posts by Ben that an absolute gradient below
.001 or so is good (sorry I didn't see these the first time I searched -
posts can be found at https://goo.gl/SxUev4 and https://goo.gl/LnyDGA). So,
correct me if I am wrong, but given that all of my absolute gradients were
well below this, it looks like things are good for the random intercept
models at least! The random slopes models are tougher, but I can probably
live without them if necessary ... perhaps I will eventually find a way to
see whether negligible random slopes or something else is causing the
problems. Any suggestions given the warnings/errors are welcome (I am
happy to provide more info if it is helpful).
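
For anyone wanting to reproduce the gradient check, here is a minimal sketch
of how the absolute and scaled gradients can be pulled out of a fitted lme4
model, following the recipe in lme4's ?convergence help page as I understand
it. Everything about the data is an assumption for illustration: `d` and
`hosp_eff` are simulated stand-ins, and a Poisson glmer() is used in place of
glmer.nb() only so the example runs quickly. The .001 cut-off is the rule of
thumb from the posts mentioned above, not a hard limit.

```r
library(lme4)

## Simulated stand-in data (hypothetical): 20 hospitals, 30 patients each
set.seed(1)
n_hosp <- 20
d <- data.frame(
  Hospital = factor(rep(seq_len(n_hosp), each = 30)),
  a2       = factor(rep(c("A", "B", "C"), times = 200))
)
hosp_eff <- rnorm(n_hosp, sd = 0.3)   # true random intercepts
d$y <- rpois(nrow(d),
             lambda = exp(1 + 0.2 * (d$a2 == "B") +
                          hosp_eff[as.integer(d$Hospital)]))

## Random-intercept model (Poisson here purely to keep the example fast)
fit <- glmer(y ~ a2 + (1 | Hospital), data = d, family = poisson)

## Gradient and Hessian stored at the optimum by the fitting routine
derivs <- fit@optinfo$derivs

## Gradient scaled by the Hessian
scaled_grad <- with(derivs, solve(Hessian, gradient))

## Largest absolute gradient, and the element-wise smaller of the
## scaled and unscaled versions
max_abs_grad    <- max(abs(derivs$gradient))
max_scaled_grad <- max(pmin(abs(scaled_grad), abs(derivs$gradient)))

print(c(max_abs_grad = max_abs_grad, max_scaled_grad = max_scaled_grad))
print(max_scaled_grad < 0.001)   # rule-of-thumb threshold from the posts
```

The same extraction should apply to a glmer.nb fit if it stores the same
`optinfo$derivs` slot, but that is worth confirming on the actual model
object before relying on it.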

On Tue, Oct 20, 2015 at 6:42 AM, Highland Statistics Ltd <
highstat at highstat.com> wrote:

>
>
> On 20/10/2015 11:32, David Jones wrote:
>
> Dear Alain - Thank you for these suggestions. In response to your
> questions:
>
> Poisson GLMM equivalents often run for these models (though I get the
> warning, "Model is nearly unidentifiable: very large eigenvalue"). For the
> models that do fit without problems, overdispersion tests do reflect
> overdispersion, and negative binomial model equivalents reflect better fit
> based on a chi-square comparison of -2LL.
>
>
> David...a better LL (or significant Chi-square test) is not an excuse for
> applying an NB GLM or NB GLMM. Overdispersion can be caused by at least 10
> different causes (and each requiring a different solution)...and you need
> to pinpoint what is driving the overdispersion. If you pick the wrong
> cause, then you may end up with a wrong model.
>
> However....you state that the overdispersion is due to a few patients who
> stay for a long time in the hospital. That would be an argument in favour
> of using the NB GLMM. But also for using a Poisson GLMM with an observation
> level random intercept. The latter one is much faster to estimate. It is not
> my favourite model....but being pragmatic......it is perhaps the way
> forward.
>
> Setting the theta to a fixed value in glmer.nb will certainly help.
>
>
> Kind regards,
>
> Alain
>
> PS...is length of stay in a hospital not strictly positive? Not that I
> want to suggest using a zero-truncated distribution for a data set with
> 500,000 observations....:-)
>
>
>
> The DV is length of stay in hospital and the overdispersion is due to some
> patients who stay for a very long time. For hospital count, there are over
> 150 hospitals.
>
>
>
> On Tue, Oct 20, 2015 at 6:18 AM, Highland Statistics Ltd <
> highstat at highstat.com> wrote:
>
>>
>>
>>
>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Mon, 19 Oct 2015 08:59:40 -0400
>>> From: David Jones <david.tn.jones at gmail.com>
>>> To: r-sig-mixed-models at r-project.org
>>> Subject: [R-sig-ME] How to know if random intercepts and slopes are
>>>         necessary for glmer.nb model
>>> Message-ID:
>>>         <
>>> CAJgUswL0mkbgpv-Xt1MsPtVbm9qGUZ+uaJ+wugPZw8Dvh-XcLA at mail.gmail.com>
>>>
>>> I am receiving a number of different warnings/errors when running
>>> glmer.nb on a fairly large dataset (N>500,000). For some of the models I
>>> have run, program-reported errors prevent the generation of estimates. I
>>> suspect that it is because the random effects are very small. I have
>>> tried models with random intercepts, as well as models with both random
>>> intercepts and slopes (all models include fixed effects). I am running
>>> models on a dataset which in theory would include random effects
>>> (patients nested within hospitals).
>>>
>>> My question is: how do you know if random intercepts and slopes are
>>> necessary, if you can't even estimate the random effects models (and thus
>>> use a model comparison test)? As far as I am aware you can look at design
>>> effects to evaluate if a random intercept is necessary (though please
>>> correct me if I am wrong here).
>>>
>>> Some example code I have used is below - many thanks.
>>>
>>> a2 <- as.factor(analysis$Location)
>>> NBIntercept <- glmer.nb(y ~ a2 + (1 | Hospital), data = analysis)
>>> NBInterceptSlope <- glmer.nb(y ~ a2 + (1 | Hospital) + (1 + a2 | Hospital),
>>>                              data = analysis)
>>>
>>>
>>>
>> David....this is a bit of a 'Gandalf' question. Perhaps you should
>> first figure out why the NB GLMM does not run. How many hospitals do you
>> have? Perhaps you can set the theta parameter in glmer.nb to a fixed value
>> (use an interval with nearly the same lower and upper limit).... and get
>> the (log of ) theta from a nearby NB GLM model. That would certainly make
>> the estimation process easier!
>>
>> Why are you doing an NB GLMM? Do the Poisson GLMM equivalents run? I
>> assume you had overdispersion. What was driving the overdispersion?
>>
>> And if computing time is slow for the second NB GLMM model, fit the first
>> model and see whether there are any a2 effects per hospital in the
>> residuals of the first model.
>>
>>
>> Alain
>>
>>
>>
>>
>> --
>> Dr. Alain F. Zuur
>>
>> First author of:
>> 1. Beginner's Guide to GAMM with R (2014).
>> 2. Beginner's Guide to GLM and GLMM with R (2013).
>> 3. Beginner's Guide to GAM with R (2012).
>> 4. Zero Inflated Models and GLMM with R (2012).
>> 5. A Beginner's Guide to R (2009).
>> 6. Mixed effects models and extensions in ecology with R (2009).
>> 7. Analysing Ecological Data (2007).
>>
>> Highland Statistics Ltd.
>> 9 St Clair Wynd
>> UK - AB41 6DZ Newburgh
>> Tel:   0044 1358 788177
>> Email: highstat at highstat.com
>> URL:   www.highstat.com
>>
>>
>
