[R-sig-ME] R-sig-mixed-models Digest, Vol 147, Issue 14

Thu Mar 14 19:39:03 CET 2019

Dear Hein,

I’ll send you (offline) a PDF of the Box-Cox paper.

Best wishes,
Tim
--
Population Policy and Practice Programme
UCL Great Ormond Street Institute of Child Health,
30 Guilford Street, London WC1N 1EH, UK

From: Hein van Lieverloo <hein.van.lieverloo using viaeterna.nl>
Date: Thursday, 14 March 2019 at 18:37
To: "Cole, Tim" <tim.cole using ucl.ac.uk>, "mollieebrooks using gmail.com" <mollieebrooks using gmail.com>
Cc: "r-sig-mixed-models using r-project.org" <r-sig-mixed-models using r-project.org>
Subject: RE: R-sig-mixed-models Digest, Vol 147, Issue 14

Dear Tim,

I will try to study the details of the link Mollie sent and your sitar package later (especially the AICadj and BICadj).
I understand from both your mails that I can safely use the generalized model (most likely the glmmTMB genpois, see my reply to Mollie).
Is there an appropriate article that I can refer to in the article I'm writing to support this decision (I would be obliged if you could send a copy)?

Kind regards,

Hein

From: Cole, Tim <tim.cole using ucl.ac.uk>
Sent: Thursday, 14 March 2019 18:45
To: mollieebrooks using gmail.com
Cc: r-sig-mixed-models using r-project.org; hein.van.lieverloo using viaeterna.nl
Subject: Re: R-sig-mixed-models Digest, Vol 147, Issue 14

Dear Mollie,

That Stack Exchange link cites Akaike (1978) as saying that you can compare models with differently transformed Y variables so long as the transformation’s Jacobian is included in the likelihood. In fact the earlier classic Box & Cox (1964) paper says the same thing and shows how to do it.

The AICadj and BICadj functions in my sitar package adjust for a Box-Cox transformed Y variable, including log(Y). The code below shows that, once adjusted, the log transform provides a worse fit (AIC 436.2 versus 422.9 for the Poisson model):

> library(glmmTMB)
> set.seed(1)
> x=rpois(100, lambda=5)
> AIC(glmmTMB(x~1, family=poisson))
[1] 422.9
> AIC(glmmTMB(log(x)~1))
[1] 128.1

> library(sitar)
> AICadj(glmmTMB(x~1, family=poisson))
glmmTMB(x ~ 1, family = poisson)
                           422.9
> AICadj(glmmTMB(log(x)~1))
glmmTMB(log(x) ~ 1)
              436.2
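
For readers who want to see the Jacobian adjustment spelled out, here is a minimal sketch in base R. It is not the sitar implementation: lm() and glm() stand in for the glmmTMB fits (an assumption, made so that the example needs no extra packages), and the variable names are illustrative only. If the adjustment matches what AICadj applies for log(Y), the adjusted value should agree with the figure reported above.

```r
# Base R only; lm() and glm() are stand-ins for the glmmTMB fits (assumption).
set.seed(1)
x <- rpois(100, lambda = 5)
stopifnot(all(x > 0))  # log(x) needs strictly positive counts

fit_pois <- glm(x ~ 1, family = poisson)  # model for x on the raw scale
fit_log  <- lm(log(x) ~ 1)                # Gaussian model for z = log(x)

# Change of variables: f_X(x) = f_Z(log x) * |dz/dx| = f_Z(log x) / x,
# so logLik on the x scale = logLik on the z scale - sum(log(x)).
jacobian_term <- -sum(log(x))

aic_log_raw <- AIC(fit_log)                    # log scale, not comparable
aic_log_adj <- aic_log_raw - 2 * jacobian_term # x scale, comparable

c(poisson        = AIC(fit_pois),
  log_unadjusted = aic_log_raw,
  log_adjusted   = aic_log_adj)
```

Only the adjusted value is on the same scale as the Poisson AIC; the unadjusted one looks far smaller simply because log(x) has a much narrower spread than x.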

Best wishes,
Tim Cole
--
Population Policy and Practice Programme
UCL Great Ormond Street Institute of Child Health,
30 Guilford Street, London WC1N 1EH, UK

------------------------------
Date: Thu, 14 Mar 2019 16:34:20 +0100
From: Mollie Brooks <mollieebrooks using gmail.com>
To: Hein van Lieverloo <hein.van.lieverloo using viaeterna.nl>
Cc: R-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Overdispersed and zero-inflated - or not - and if so, how to model them? #glmmTMB
Message-ID: <DE824630-F149-4D53-BCCC-5ED3E1F0480F using gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Hein,

See replies below...

On 14 Mar 2019, at 15:46, Hein van Lieverloo <hein.van.lieverloo using viaeterna.nl> wrote:
Dear all,
Keywords: #glmmTMB  #overdisp  #zero_count
I am grateful for this mailing list and in advance, for any helpful
response.
This e-mail has two related questions.
Details (summary, background, approach and results) are given below them.
Question 1: my data are zero-inflated and overdispersed, but what does the
overdispersion parameter in glmmTMB (genpois, nbinom1, nbinom2) tell me?
It is very high in the genpois and nbinom1 models (see question 2), and I
thought it should be near 1, as in nbinom2 (>> 1 is overdispersed, << 1 is
underdispersed).
But when I test these generalized models for overdispersion (overdisp from
sjstats), no overdispersion is indicated.

The dispersion parameter in a glmmTMB model is there to handle the dispersion, and it’s fine if it differs from 1; what it means depends on the family. So your tests with sjstats seem to be correct. For descriptions of how the dispersion parameters relate to the variance, see ?sigma.glmmTMB
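
As a rough illustration of why the three estimates are not on a common scale, the sketch below writes out the variance functions as I read them in ?sigma.glmmTMB (an assumption; please check the help page for your installed version). The function names and the mean of 200 are hypothetical and chosen only for illustration; the dispersion values are the ones quoted in the question.

```r
# Variance implied by each family, as I read ?sigma.glmmTMB
# (assumption: verify against your installed glmmTMB version).
var_genpois <- function(mu, phi) mu * phi             # Var = mu * phi
var_nbinom1 <- function(mu, phi) mu * (1 + phi)       # Var = mu * (1 + phi)
var_nbinom2 <- function(mu, phi) mu * (1 + mu / phi)  # Var = mu * (1 + mu / phi)

mu <- 200  # hypothetical mean count, for illustration only

# Dispersion estimates quoted in the question
implied <- c(
  genpois = var_genpois(mu, 613),
  nbinom1 = var_nbinom1(mu, 287),
  nbinom2 = var_nbinom2(mu, 0.72)
)

implied / mu  # variance-to-mean ratio under each family
```

Under these formulas a small nbinom2 parameter and large genpois or nbinom1 parameters can all describe strong overdispersion, so a value far from 1 is not in itself a sign of a problem.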

Question 2: should I use Gaussian on log(counts) with AIC 2068, or use
nbinom2 with AIC 8036 and add overdispersion and zero-inflation models to
get a lower AIC (and if so, how)?
When I use glmmTMB on counts with poisson, I get an AIC of 117 856.
Testing the model with overdisp and zero_count (from the sjstats package), I
find p = 0 (overdispersed) and zc-ratio 0.81 (probable zero-inflation).
When I use glmmTMB on log10(counts), with zeros replaced by 0.1 (so that
they become -1 after the transform), I get an AIC of 2068 (with lmer: 2122).
Looks fine, but may be wrong.
When I use glmmTMB on counts with either genpois (dispersion par 613),
nbinom1 (dispersion par 287) or nbinom2 (dispersion par 0.72), I get AICs
of 8036 or more. Much higher, but may be OK.

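You can’t directly compare the AIC of a model fitted to log-transformed data with that of a model fitted to the raw data, because the two likelihoods are on different scales. For example,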
set.seed(1)
x=rpois(100, lambda=5)
AIC(glmmTMB(log(x)~1))
[1] 128.0742
AIC(glmmTMB(x~1, family=poisson))
[1] 422.911

or see discussion here https://stats.stackexchange.com/questions/61332/comparing-aic-of-a-model-and-its-log-transformed-version
