[R-sig-ME] Fwd: model check for negative binomial model
Salvador SANCHEZ COLON
@@|v@dor@@nchezco|on @end|ng |rom prod|gy@net@mx
Tue Feb 18 04:27:38 CET 2020
CORRECTION
Dear Alessandra,
My apologies, I msread the output you pasted in your previous messages. I did not realize that the parameter estimates that you were providing correspond to a negative binomial model, I assumed that they came from a binomial (logit) model. As Profr. Bolker mentioned, you cannot fit a negative binomial model to your data, it has to be a binomial (logit model) in which you model the number of eggs that hatch out of the total number of eggs in the nest. What the model predicts (after back transforming the odds ratio) is the probability, p, of an egg hatching, which can then be translated into the expected number of eggs that hatch simply by multiplying it by the number of eggs in the nest, np.
I hope this helps and, again, my apologies for the confusion and for messing things yp.
Salvador
En Lun, 17 Febrero, 2020 en 20:46, yo <salvadorsanchezcolon using prodigy.net.mx> escribió:
Para: bielli.alessandra using gmail.com
Cc: bbolker using gmail.com; r-sig-mixed-models using r-project.org
Cara Alessandra,
If I may interject into your conversation, the key to your question lies in the parameter estimates:for the fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.38864 0.08227 -16.879 < 2e-16 ***
Relocation..Y.N.Y 0.32105 0.09152 3.508 0.000452 ***
SPL 0.22218 0.08793 2.527 0.011508 *
As I understand your design, you have two independent variables (factors): Species and treatment, each with two levels: L vs. G and Y vs N.
Thus, the logit model that you obtain (omitting the random effects) is:
logit(p) = log(p/(1-p) = -1.38864 + 0.32105*Treatment + 0.22218*Species
Treatment and species are two-level factors which are coded as 0 or 1 and the model has to have parameter estimates for each of those levels. By design, in generalized linear models parameters are estimated by taking one of the levels of each factor as the reference level and assigned a value of 0, and the parameters for the other levels are expressed in relation to the reference level. Thus, in your case, treatment N and species G are designated as the reference levels for their corresponding factors and, hence, their parameter values are both 0 (and not listed in the output). Then, treatment Y has a positive effect (0.32) on the odds ratio of hatching/not-hatching compared to treatment N; and species L also has a positive effect (0.22 times) on the odds ratio of hatching vs not-hatching.
The intercept then (-1.38864) denotes the odds ratio for the combination of treatment N and species G; that is, when both factors are at their reference level of 0. This translates into a probability of hatching (p):
p = exp(-1.38864)/(1+exp(-1.38864) = 0.1996
For treatment Y and species G, the model becomes:
p = exp(-1.38864 + 0.32105)/(1+exp(-1.38864 + 0.32105) = 0.25569
as treatment Y has a positive effect on the odds ratio of hatching.
For treatment N and species L, the model becomes:
p = exp(-1.38864 + 0.22218)/(1+exp(-1.38864 + 0.22218) = 0.2375
as species L has a positive effect on the odds ratio of hatching.
Finally, for the combination of treament Y and species L, the model becomes:
p = exp(-1.38864 + 0.32105 + 0.22218)/(1+exp(-1.38864 + 0.32105 + 0.22218) = 0.3002
as both levels Y and L have positive effects on the odds ratio of hatching.
I hope this helps.
Best regards,
Salvador
En Lun, 17 Febrero, 2020 en 18:2, Alessandra Bielli <bielli.alessandra using gmail.com> escribió:
Para: Ben Bolker
Cc: r-sig-mixed-models using r-project.orgDear Ben
I am trying to make a prediction for the combination of species (L or G)
and treatment (control/experiment).
I am still confused about the prediction values. I would like to present
results as a success rate for a nest, to say that treatment
increases/decreases success by ...%. But the value I have is the
probability that 1 egg in the nest succeeds, correct? I am not sure how to
use these predictions.
Thanks for your help!
Alessandra
On Mon, Feb 17, 2020 at 2:15 PM Ben Bolker <bbolker using gmail.com> wrote:
>
> That's correct. There are some delicate issues about prediction:
>
> * do you want to use the original (potentially unbalanced) data for
> prediction? (That's what you're doing here).
> * or, do you want to make predictions for a "typical" nest and week
> combination, in which case you would use
>
>
> pframe <- with(your_data,
> expand.grid(Relocation..Y.N.=unique(Relocation..Y.N.),
> SP=unique(SP))
> predict(m.unhatched,type="response",re.form=NA,newdata=pframe))
>
> You could also use expand.grid() to generate a balanced design (i.e.
> all combinations of weeks and nests), which would give yet another answer.
>
> There are a lot of packages designed for doing these kinds of
> post-fitting manipulations (e.g. 'margins', 'emmeans'), you might find
> them useful ...
>
>
>
> On 2020-02-17 1:48 p.m., Alessandra Bielli wrote:
> > Dear Thierry
> >
> > Thanks for your reply.
> >
> > I read a bit about the prediction for a binomial model with
> > success/failures and I have a couple of questions.
> >
> > If I use the predict function with the model you recommended, I obtain
> > log.odds or probabilities if I use "type=response":
> >
> >
> tapply(predict(m.unhatched,type="response"),list(main$SP,main$Relocation..Y.N.),mean)
> > N Y
> > G 0.7314196 0.6414554
> > L 0.6983576 0.6003087
> >
> > Are these probabilities of success (i.e. hatched) in one nest?
> >
> > Thanks,
> >
> > Alessandra
> >
> > On Mon, Feb 17, 2020 at 7:18 AM Thierry Onkelinx
> > <thierry.onkelinx using inbo.be <mailto:thierry.onkelinx using inbo.be>> wrote:
> >
> > Dear Alessandra,
> >
> > Since you have both the number hatched and the total clutch size you
> > can calculate the number of successes and failures. That is
> > sufficient for a binomial distribution.
> >
> > glmer(cbind(Hatched, Unhatched) ~ Relocation..Y.N. + SP + (1 |
> > Beach_ID) + (1 | Week), family = binmial)
> >
> > A negative binomial or Poisson allow predictions larger than the
> > offset. Which is nonsense given that the number hatched cannot
> > surpass the total clutch size.
> >
> > Best regards,
> >
> > ir. Thierry Onkelinx
> > Statisticus / Statistician
> >
> > Vlaamse Overheid / Government of Flanders
> > INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR
> > NATURE AND FOREST
> > Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> > thierry.onkelinx using inbo.be <mailto:thierry.onkelinx using inbo.be>
> > Havenlaan 88 bus 73, 1000 Brussel
> > www.inbo.be <http://www.inbo.be>
> >
> >
> ///////////////////////////////////////////////////////////////////////////////////////////
> > To call in the statistician after the experiment is done may be no
> > more than asking him to perform a post-mortem examination: he may be
> > able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > The plural of anecdote is not data. ~ Roger Brinner
> > The combination of some data and an aching desire for an answer does
> > not ensure that a reasonable answer can be extracted from a given
> > body of data. ~ John Tukey
> >
> ///////////////////////////////////////////////////////////////////////////////////////////
> >
> > <https://www.inbo.be>
> >
> >
> > Op wo 12 feb. 2020 om 18:42 schreef Alessandra Bielli
> > <bielli.alessandra using gmail.com <mailto:bielli.alessandra using gmail.com>>:
> >
> > Dear Ben
> >
> > Thanks for your quick response.
> >
> > Yes, emergence success is usually between 60 and 80% or higher.
> > I am not sure how to use a binomial, if my data are counts?
> >
> > Can you explain why the approximation doesn't work well if
> > success gets
> > much above 50%? Does it make sense, then, to have "unhatched" as
> > dependent
> > variable, so that I predict mortality (usually below 50%) using
> > a nb with
> > offset(log(total clutch)) ?
> >
> > > summary(m.emerged)
> > Generalized linear mixed model fit by maximum likelihood (Laplace
> > Approximation) ['glmerMod']
> > Family: Negative Binomial(2.2104) ( log )
> > Formula: Unhatched ~ Relocation..Y.N. + SP +
> > offset(log(Total_Clutch)) +
> > (1 | Beach_ID) + (1 | Week)
> > Data: main
> >
> > AIC BIC logLik deviance df.resid
> > 5439.4 5466.0 -2713.7 5427.4 614
> >
> > Scaled residuals:
> > Min 1Q Median 3Q Max
> > -1.4383 -0.7242 -0.2287 0.4866 4.0531
> >
> > Random effects:
> > Groups Name Variance Std.Dev.
> > Week (Intercept) 0.003092 0.0556
> > Beach_ID (Intercept) 0.025894 0.1609
> > Number of obs: 620, groups: Week, 31; Beach_ID, 8
> >
> > Fixed effects:
> > Estimate Std. Error z value Pr(>|z|)
> > (Intercept) -1.38864 0.08227 -16.879 < 2e-16 ***
> > Relocation..Y.N.Y 0.32105 0.09152 3.508 0.000452 ***
> > SPL 0.22218 0.08793 2.527 0.011508 *
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Correlation of Fixed Effects:
> > (Intr) R..Y.N
> > Rlct..Y.N.Y -0.143
> > SPL -0.540 -0.038
> >
> > Thanks,
> >
> > Alessandra
> >
> > On Tue, Feb 11, 2020 at 7:29 PM Ben Bolker <bbolker using gmail.com
> > <mailto:bbolker using gmail.com>> wrote:
> >
> > >
> > > Short answer: if emergence success gets much above 50%, then
> the
> > > approximation you're making (Poisson + offset for binomial, or
> > NB +
> > > offset for negative binomial) doesn't work well. You might
> try a
> > > beta-binomial (with glmmTMB) or a binomial + an
> > observation-level random
> > > effect.
> > >
> > > (On the other hand, your intercept is -0.3, which
> > corresponds to a
> > > baseline emergence of 0.42 - not *very* high (but some beaches
> > and years
> > > will be well above that ...)
> > >
> > > Beyond that, are there any obvious patterns of mis-fit in the
> > > predicted values ... ?
> > >
> > > On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:
> > > > Dear list
> > > >
> > > > I am fitting a poisson model to estimate the effect of a
> > treatment on
> > > > emergence success of hatchlings. To estimate emergence
> > success, I use
> > > > number of emerged and an offset(log(total clutch).
> > > >
> > > > However, overdispersion was detected:
> > > >
> > > >> overdisp_fun(m.emerged) #overdispersion detected
> > > >
> > > > chisq ratio rdf p
> > > > 3490.300836 5.684529 614.000000 0.000000
> > > >
> > > > Therefore, I switched to a negative binomial. I know
> > overdispersion is
> > > not
> > > > relevant for nb models, but the model plots don't look too
> > good. I also
> > > > tried to fit a poisson model with OLRE, but still the plots
> > don't look
> > > > good.
> > > > How do I know if my model is good enough, and what can I do
> > to improve
> > > it?
> > > >
> > > >> summary(m.emerged)
> > > > Generalized linear mixed model fit by maximum likelihood
> > (Laplace
> > > > Approximation) ['glmerMod']
> > > > Family: Negative Binomial(7.604) ( log )
> > > > Formula: Hatched ~ Relocation..Y.N. + SP +
> > offset(log(Total_Clutch)) + (1
> > > > |Beach_ID) + (1 | Year)
> > > > Data: main
> > > >
> > > > AIC BIC logLik deviance df.resid
> > > > 6015.6 6042.2 -3001.8 6003.6 614
> > > >
> > > > Scaled residuals:
> > > > Min 1Q Median 3Q Max
> > > > -2.6427 -0.3790 0.1790 0.5242 1.6583
> > > >
> > > > Random effects:
> > > > Groups Name Variance Std.Dev.
> > > > Beach_ID (Intercept) 0.004438 0.06662
> > > > Year (Intercept) 0.001640 0.04050
> > > > Number of obs: 620, groups: Beach_ID, 8; Year, 5
> > > >
> > > > Fixed effects:
> > > > Estimate Std. Error z value Pr(>|z|)
> > > > (Intercept) -0.29915 0.04055 -7.377 1.62e-13 ***
> > > > Relocation..Y.N.Y -0.16402 0.05052 -3.247 0.00117 **
> > > > SPL -0.08311 0.04365 -1.904 0.05689 .
> > > > ---
> > > > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
> 1
> > > >
> > > > Correlation of Fixed Effects:
> > > > (Intr) R..Y.N
> > > > Rlct..Y.N.Y -0.114
> > > > SPL -0.497 -0.054
> > > >
> > > >
> > > > Thanks for your help,
> > > >
> > > > Alessandra
> > > >
> > > >
> > > > _______________________________________________
> > > > R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > >
> > >
> > > _______________________________________________
> > > R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
[[alternative HTML version deleted]]
_______________________________________________
R-sig-mixed-models using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
.
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list