[R-sig-ME] making inferences about overdispersion in nbinom glmmADMB

Joshua Wiley jwiley.psych at gmail.com
Thu Jun 20 04:52:19 CEST 2013


Hi Paul,

Standard errors on variance components do not provide a reliable means
of assessing significance or constructing confidence intervals.  This
is partly because variances are bounded below at zero, so you cannot
expect their sampling distributions to be symmetric.  If the estimates
are far enough from the boundary, that may not be an issue, but as a
general principle, encouraging the use of standard errors with a
standard normal reference is not going to work well.
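
As a quick illustration (just an ordinary sample variance, nothing
mixed-model specific), the sampling distribution of a variance
estimate is right-skewed, so a symmetric interval is a poor summary:

set.seed(1)
## 5000 sample variances from N(0, 1) samples of size 10;
## the true variance is 1
vhat <- replicate(5000, var(rnorm(10)))
hist(vhat, breaks = 50)
quantile(vhat, c(.025, .975))           # asymmetric around 1
mean(vhat) + c(-1.96, 1.96) * sd(vhat)  # symmetric normal-theory interval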

If you are bound to a frequentist framework, a likelihood ratio test
comparing the freely estimated model with a constrained one is
probably a good choice.  For confidence intervals, a good approach is
based on the profile likelihood.  I am sure these methods are
presented elsewhere, but Terry Therneau gives a good presentation of
the technique and theory in his text on survival models.
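
For example, here is a minimal sketch of the likelihood ratio idea
without random effects, using MASS::glm.nb rather than glmmADMB.  The
Poisson fit is the constrained model (no overdispersion), and because
the null value sits on the boundary of the parameter space, the naive
chi-square p-value is conservative and is conventionally halved:

library(MASS)

set.seed(1)
dd <- data.frame(x = rnorm(500))
dd$y <- rnegbin(500, mu = exp(1 + .3 * dd$x), theta = 2)

fit0 <- glm(y ~ x, family = poisson, data = dd)  # overdispersion constrained out
fit1 <- glm.nb(y ~ x, data = dd)                 # overdispersion (theta) free

## likelihood ratio test; halve the p-value for the boundary null
lrt <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))
pchisq(lrt, df = 1, lower.tail = FALSE) / 2

## glm.nb also reports theta and its SE, but for an interval I would
## profile or bootstrap theta rather than use theta +/- 1.96 * SE
fit1$theta
fit1$SE.theta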

Below is an alternative Bayesian approach.  The parameterization is of
course slightly different, as this uses MCMCglmm with a non-zero
residual variance rather than a negative binomial model, but the
posterior allows valid examination of the parameter, computation of
credible intervals, and presentation of the full distribution.  Note
that for larger values of the variance this is not an issue, but in
the last example (admittedly chosen as a near-degenerate case)
normal-based confidence intervals would perform quite poorly.
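
For reference, the two parameterizations line up roughly as follows: a
Poisson model with a normal residual of variance s2 on the log scale
has Var(y) approximately mu + mu^2 * (exp(s2) - 1), while the negative
binomial has Var(y) = mu + mu^2 / theta, so exp(s2) - 1 plays the role
of 1/theta (1/alpha in glmmADMB's notation).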

Best,

Josh


library(MASS)      # for rnegbin()
library(MCMCglmm)

set.seed(10)
d <- data.frame(x = rnorm(1000))

## simulate negative binomial counts at three levels of overdispersion;
## rnegbin(n, mu, theta), where smaller theta means more overdispersion
d <- within(d, {
  y1 <- rnegbin(1000, 2 + .5 * x, 1)
  y2 <- rnegbin(1000, 2 + .5 * x, 4)
  y3 <- rnegbin(1000, 2 + .5 * x, 16)
})

## Poisson models in MCMCglmm; the residual ("units") variance
## absorbs the overdispersion on the latent (log) scale
m <- list(
  t1 = MCMCglmm(y1 ~ x, data = d, family = "poisson", nitt = 1e5, verbose = FALSE),
  t4 = MCMCglmm(y2 ~ x, data = d, family = "poisson", nitt = 1e5, verbose = FALSE),
  t16 = MCMCglmm(y3 ~ x, data = d, family = "poisson", nitt = 1e5, verbose = FALSE))

## trace and density plots of the residual variance
plot(m$t1$VCV)
plot(m$t4$VCV)
plot(m$t16$VCV)

## 95% highest posterior density intervals for the variances
HPDinterval(m$t1$VCV)
HPDinterval(m$t4$VCV)
HPDinterval(m$t16$VCV)

## overlay a normal density (same mean and SD) on the posterior histogram
p <- function(x) {
  hist(x, freq = FALSE)
  xs <- seq(min(x), max(x), length.out = 1000)
  lines(xs, dnorm(xs, mean = mean(x), sd = sd(x)), col = "blue")
}

## the normal approximation is reasonable for t1 and t4 but breaks down
## for t16, where the residual variance is squeezed against zero
p(m$t1$VCV[, 1])
p(m$t4$VCV[, 1])
p(m$t16$VCV[, 1])
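
To see that failure directly, you can put a naive symmetric interval
next to the HPD interval for the t16 fit:

## naive normal-approximation interval, mean +/- 1.96 SD, vs. the HPD
## interval; for t16 the symmetric interval misrepresents the skewed
## posterior and its lower bound can fall at or below zero
v <- m$t16$VCV[, 1]
mean(v) + c(-1.96, 1.96) * sd(v)
HPDinterval(m$t16$VCV)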



On Wed, Jun 19, 2013 at 5:17 AM, Paul Johnson
<paul.johnson at glasgow.ac.uk> wrote:
> Hi,
>
> I'm fitting negative binomial GLMMs in glmmADMB using glmmadmb(…, family = "nbinom"). I'm interested in making inferences about the amount of overdispersion in each model, and comparing overdispersion between models. The outcome is mosquito count and the different models are different designs of mosquito trap. Each model fit gives an estimate of alpha (the [inverse] overdispersion parameter) and sd_alpha, the standard error of alpha. It's tempting to use the standard error of alpha to construct CIs, test for differences, etc., but I have no idea if this is justified. It seems over-optimistic to assume that sampling error in alpha is approximately normally distributed. Are there conditions (e.g. large sample size) where this assumption is justified?
>
> Thanks for your help,
> Paul Johnson
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com


