[R-sig-ME] more on nbinom1 vs 2

Ben Bolker bbolker at gmail.com
Fri Oct 23 03:49:37 CEST 2020


   Very short answer: all your insights here look correct, and well 
expressed.

   I think the problem with your earlier aggregation question (I vaguely 
remember it) is a fairly common one with well-posed but moderately 
interesting/difficult questions: questions that take more than a few 
minutes to answer adequately, and that don't happen to be in someone's 
wheelhouse -- so that they've neither thought about it before and have 
an answer ready, *nor* find it worth taking some time to work on it -- 
often get neglected and gradually sink into the pile.

  This is one of the advantages of forums like StackOverflow or 
CrossValidated: (1) they are much easier to search for old questions, 
and (2) they allow people to offer 'brownie points' for solutions to 
interesting questions.  (I think a sufficient interval has gone by that 
it would be reasonable to cross-post it to CrossValidated ...)

On 10/22/20 9:40 PM, Don Cohen wrote:
> 
> I'm still hoping to see some reaction to my message of 10-16
> on aggregation of count data.
> 
> In the meantime, here's an attempt to explain something related.
> I'm again hoping for feedback - is this all correct, am I missing
> something important?
> 
> I now think I (finally) understand that nbinom1 is really the SAME
> family of distributions as nbinom2, just differently parameterized.
> How well a set of values fits a single NB distribution has nothing
> to do with whether the distribution is described by the parameters
> of nbinom1 or those of nbinom2.
> It is a set of different NB distributions that can fit one better
> than the other, and most models actually do predict a set of
> distributions rather than just one.  In particular, if there
> are covariates, then a different distribution is predicted for
> each value of the covariates.
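
To make the "same family, different parameterization" point concrete, here is a small self-contained Python sketch (illustrative numbers only, not glmmTMB output): an NB1 pair (mu, phi) and the NB2 pair (mu, theta = mu/phi) give the same mean and variance, and therefore the same distribution.

```python
# Illustrative only: one negative binomial distribution written two ways.
# nbinom1: variance = mu * (1 + phi);  nbinom2: variance = mu * (1 + mu/theta).
# Matching the variances forces theta = mu / phi.
from math import lgamma, exp, log

def nb_pmf(k, mu, var):
    """NB probability mass at k, parameterized by mean and variance (var > mu)."""
    p = mu / var              # 'success' probability in the classic (r, p) form
    r = mu * mu / (var - mu)  # size parameter
    return exp(lgamma(k + r) - lgamma(r) - lgamma(k + 1)
               + r * log(p) + k * log(1 - p))

mu, phi = 5.0, 2.0        # an NB1 description: variance = 5 * (1 + 2) = 15
theta = mu / phi          # the equivalent NB2 dispersion: 2.5
var1 = mu * (1 + phi)
var2 = mu * (1 + mu / theta)
assert var1 == var2 == 15.0
# Same mean and variance pin down the same NB member, so the PMFs agree:
for k in range(20):
    assert abs(nb_pmf(k, mu, var1) - nb_pmf(k, mu, var2)) < 1e-12
```
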
> 
> If there are no covariates, then there should be no difference
> between nbinom1 and nbinom2, except for different overdispersion
> parameters predicting the same variance.  (This variance is
> presumably observed in the different result values.)
> 
> Getting rid of covariates,
> if glmmTMB(result~1,family=nbinom1,data=D) says
> 
>   Overdispersion parameter for nbinom1 family (): x
>   with (Intercept) y
> 
> while glmmTMB(result~1,family=nbinom2,data=D) says
> 
>   Overdispersion parameter for nbinom2 family (): z
>   with (Intercept) w
> 
> then y better be the same as w, since the mean would be
> exp(y) in the first case and exp(w) in the second.
> Similarly, the variance would be
>   mean * (1 + param) = exp(y) * (1 + x) in the first case and
>   mean * (1 + (mean/param)) = exp(w) * (1 + (exp(w)/z)) in the second,
> which again should be the same value.
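
A quick moment-matching sketch of that claim (made-up counts; glmmTMB estimates by maximum likelihood, so its numbers would differ slightly, but the algebra linking the two dispersion parameters to one common mean and variance is the same):

```python
# Illustrative only: with no covariates, both parameterizations describe the
# same fitted mean and variance; only the reported dispersion parameter differs.
counts = [0, 1, 1, 2, 3, 3, 4, 6, 8, 12]   # hypothetical count data
n = len(counts)
m = sum(counts) / n                              # common mean ~ exp(y) = exp(w)
v = sum((c - m) ** 2 for c in counts) / (n - 1)  # sample variance
x = v / m - 1            # NB1 dispersion:  v = m * (1 + x)
z = m * m / (v - m)      # NB2 dispersion:  v = m * (1 + m / z)
assert abs(m * (1 + x) - v) < 1e-9
assert abs(m * (1 + m / z) - v) < 1e-9
print(round(x, 3), round(z, 3))  # 2.444 1.636 -- different parameters, same fit
```
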
> 
> This was indeed what I found when I tried it.
> This remained true when I added an offset: result~offset(log(exposure))
> 
> However, when I added a random effect: result ~ (1|group)
> I was surprised to get different results for nbinom1 and nbinom2, i.e.,
> different AIC and different intercept.
> I also noticed a difference in the variance of the random effect.
> 
> I now think I understand why.  The random effect allows different
> means and variances for different groups, and this (unlike any
> previous examples) can agree with nbinom1 better or worse than
> nbinom2, depending on whether the relation between the means and
> variances of the groups is closer to linear or quadratic.
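
A sketch of why no single pair of dispersion parameters can agree across several groups (all numbers invented): with group-specific means, NB1 with a shared phi forces the group variances to grow linearly in the mean, while NB2 with a shared theta forces quadratic growth.

```python
# Illustrative only: group-level mean-variance relations implied by each family.
phi, theta = 2.0, 2.5
group_means = [1.0, 5.0, 25.0]
nb1_vars = [m * (1 + phi) for m in group_means]        # linear in the mean
nb2_vars = [m * (1 + m / theta) for m in group_means]  # quadratic in the mean
# var/mean is constant under NB1 but increases with the mean under NB2:
print([v / m for v, m in zip(nb1_vars, group_means)])  # [3.0, 3.0, 3.0]
print([v / m for v, m in zip(nb2_vars, group_means)])  # [1.4, 3.0, 11.0]
# The two agree only at mu = phi * theta = 5, so with multiple distinct group
# means the two families really are different models.
```
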
> 
> Perhaps I should stop here and wait for replies before moving
> on to how this is related to the aggregation issue in the
> earlier message.
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>


