[R-sig-ME] glmmTMB: phi in betabinomial dispersion model

Thu Jan 14 16:44:25 CET 2021

   Honestly, the BB parameterization in glmmTMB was chosen because we 
(I) weren't aware of the other options. The one we use is the most 
"natural" way I can think of to construct the BB (compounding a binomial 
with its conjugate prior distribution, analogous to a 
Dirichlet-multinomial or negative binomial).

   Looking at the Prentice paper it seems useful; the two bits that 
would be hard would be (1) rather than a simple closed-form expression 
it is expressed as the combination of sums: for params p (prob) (q=1-p, 
I think) and gamma (dispersion), the log-likelihood of y out of n 
successes is

sum(i=0,y-1) log(p+gamma*i) + sum(i=0,n-y-1) log(q+gamma*i) - 
sum(i=0,n-1) log(1+gamma*i)

  This is not as bad as log-likelihoods that have to be computed via 
infinite sums, but it will presumably be a lot slower than a likelihood 
that involves only log-factorial and log-beta functions, sums and products	

  The other thing that looks potentially tricky is that the lower bound 
of the dispersion parameter is data-dependent:

  y >= max{ -p(n - 1)1, -q(n - 1)-1}

   Since p, q, n will vary across a given data set (and p and q will 
depend on the parameters of the model for the conditional mean), it's 
not immediately obvious to me how we would implement this (glmmTMB 
typically works by fitting parameters on an unconstrained space) ... 
(there are some cruder approaches that would involve penalization ...)

   Feel free to open an issue in the glmmTMB github repository ...

On 1/14/21 3:30 AM, John Maindonald wrote:
> This relates to the betabinomial error family.
> 
> Note that, in the way that the model is formulated in glmmTMB,
> the binomial is a limiting case of the betabinomial as phi goes to
> infinity, not a special case.  This seems to me somewhat unfortunate.
> Was the parameterization used chosen in preference to other possibilities
> for any particular reason?  The sigma parameter in the gamlss implementation
> is 1/phi, with a log link as the default, where phi is the betabinomial sigma
> parameter.  I am not sure what the limiting lower value of intra-class correlation
> rho is, if the link is set as “identity”, but in theory it might do mildly negative.
> 
> Is there any particular reason for the glmmTMB parameterization?
> 
> The theory, if the sigma parameter is suitable parameterized, allows the
> intra-class correlation rho to go negative.  See:
>    Prentice, RL. 1986. “Binary Regression Using an Extended Beta-Binomial Distribution,
>    with Discussion of Correlation Induced by Covariate Measurement Errors.” Journal of
>    the American Statistical Association 81 (394): 321–27.
> Does anyone know of a readily usable implementation of that more general
> error model?
> 
> 
> John Maindonald             email: john.maindonald using anu.edu.au<mailto:john.maindonald using anu.edu.au>
> 
> 
> If you specify  "dispformula=~1”, the output is the same as when a dispformula is not
> specified.  Certainly, that is the case for models with a betabinomial error.  One can
> extract phi as “phi=sigma(obj)
> 
> If you specify
> a dispformula with >1 coefficients, then calculate either
> 
>    coef(HCbb.cll)[['disp’]]  ## Gives the coefficients
> or
>    coef(summary(HCbb.cll))[['disp’]]  ## Adds SE, etc, information
> 
> In both cases a logarithmic link function is used.
> 
> The following is a function that I have found useful:
> 
> getRho <-
> function (obj)
> {
>      mm <- model.matrix(obj$modelInfo$allForm$dispformula, data = obj$frame)
>      fixdisp <- fixef(obj)[["disp"]]
>      1/(1 + exp(mm %*% fixdisp))
> }
> 
> If you just want phi, rather than the intra-class correlation rho (which makes more
> intuitive sense to me), just set “phi =  exp(mm %*% fixdisp)”.  I think this works
> in just the same way with nbinom2 errors, but I have not checked the details.
> 
> John Maindonald             email: john.maindonald using anu.edu.au<mailto:john.maindonald using anu.edu.au>
> 
> On 13/01/2021, at 21:16, matthias.suter--- via R-sig-mixed-models <r-sig-mixed-models using r-project.org<mailto:r-sig-mixed-models using r-project.org>> wrote:
> 
> When fitting a negative binomial model in glmmTMB() using "family=nbinom2", but without specifying the dispersion model, the summary() gives an estimate for the dispersion parameter as "Overdispersion parameter for nbinom2 family (): XX", which represents - as far I understand - the phi value described in the manual (also termed theta in other environments).
> 
> 
> If a dispersion model is specified with "dispformula=~ ...", the summary() gives the estimates for the "Dispersion model". How do these relate to the phi value? I read in the manual that "phi=exp(eta) (where eta is the linear predictor from the dispersion model)". Does this indicate that these estimates of the Dispersion model appear on the log scale, and to get phi, they must be back-transformed?
> 
> 
> [[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>