[R-sig-ME] MCMCglmm for binomial models?

Mon Sep 20 19:03:28 CEST 2010

Dear Bob,

As Thierry points out the main difference is that MCMCglmm  
automatically fits a "residual" variance and in this case (in most  
cases!) the data are over-dispersed. I am sure glmer gives the right  
answer for the model it is trying to fit, but I would say the model is  
wrong. For example, if you fit an observation level random effect in  
glmer you will find that the herd variance is more than an order of  
magnitude smaller than what it was:

cbpp$id<-as.factor(1:dim(cbpp)[1])

gm3b <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd)+(1|id),
              family = binomial, data = cbpp)

summary(gm3b)

Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(incidence, size - incidence) ~ period + (1 | herd) + (1  
|      id)
    Data: cbpp
    AIC   BIC logLik deviance
  102.7 114.8 -45.34    90.68
Random effects:
  Groups Name        Variance Std.Dev.
  id     (Intercept) 0.794023 0.89108
  herd   (Intercept) 0.033835 0.18394
Number of obs: 56, groups: id, 56; herd, 15

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.5003     0.2888  -5.196 2.04e-07 ***
period2      -1.2265     0.4735  -2.591  0.00958 **
period3      -1.3288     0.4884  -2.721  0.00651 **
period4      -1.8663     0.5906  -3.160  0.00158 **
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
         (Intr) perid2 perid3
period2 -0.594
period3 -0.576  0.353
period4 -0.476  0.292  0.282

With respect to the prior - MCMCglmm can hit (and does with this  
problem occasionally) numerical problems with near-singular covariance  
matrices when priors aren't used.  Here are some priors that may be of  
use:

  prior1=list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002)))

This fits an inverse-Wishart prior for both the units (residual)  
variance (the R element) and an inverse-Wishart for the herd variance  
(the first and only element of G). If you fitted >1 random term G  
would be a list with an additional element (you could call it G2 or  
anything you like). This prior is equivalent to an inverse gamma prior  
with shape=scale=0.001, which used to be used a lot in WinBUGS. It is  
now known to have poor properties when a variance is close to zero and  
so an alternative recommendation is to use parameter expanded priors.  
Parameter expansion was originally developed for computational reasons  
but it turns out that it induces scaled non-central F priors for the  
variance components. These can have nicer properties (see section 8 of  
the CourseNotes or the Gelman reference). As an example:

   prior2=list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=1,  
alpha.mu=0, alpha.V=100)))

For both priors the results are broadly similar and give answers close  
to glmer:

mc3b<-MCMCglmm(cbind(incidence, size - incidence) ~ period,
  random = ~ herd,
  family = "multinomial2",
  data = cbpp, verbose = FALSE,
  nitt = 50E3,
prior=prior1)

mc3c<-MCMCglmm(cbind(incidence, size - incidence) ~ period,
  random = ~ herd,
  family = "multinomial2",
  data = cbpp, verbose = FALSE,
  nitt = 50E3,
prior=prior2)

The posteriors for the variances are skewed making the comparison  
between the posterior mean (i.e. as shown in summary(mc3b)) and the  
REML mode difficult. However, a plot shows the correspondence nicely:

plot(cbind(mc3b$VCV), pch=19, cex=0.2)
points(cbind(mc3c$VCV), pch=19, cex=0.2, col="green")
points(0.033835, 0.794023, col="red", pch=19)

Cheers,

Jarrod

Quoting "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>:

> Dear Bob,
>
> Notice that MCMCglmm always estimates 'units' to take overdispersion
> into account. So the glmer model and the MCMCglmm model are not the
> same.
>
> Furthermore, you did not specify the priors. I'm not sure if the default
> work well for all types of models.
>
> HTH,
>
> Thierry
>
>
> ------------------------------------------------------------------------
> ----
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to
> say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data.
> ~ John Tukey
>
>
>> -----Oorspronkelijk bericht-----
>> Van: r-sig-mixed-models-bounces at r-project.org
>> [mailto:r-sig-mixed-models-bounces at r-project.org] Namens Bob Farmer
>> Verzonden: maandag 20 september 2010 14:18
>> Aan: David Duffy; r-sig-mixed-models at r-project.org
>> Onderwerp: Re: [R-sig-ME] MCMCglmm for binomial models?
>>
>> Thanks for the reply.  Unfortunately, I'm not sure that
>> that's the whole story.  Upping the iterations to achieve an
>> effective sample size of 1000 (output pasted below) still
>> leads to some differences between the two models (total
>> runtime on a quad-core Windows box is under 2 minutes, in
>> case you want to try).  In fact, running it a second time
>> shooting for an n.eff of 2000 (thin = 325) leads to even more
>> divergent estimates (e.g. mean.int == -72.96).  Again,
>> assuming that the glmer() output in this case is a `gold
>> standard', maybe there's another puzzle piece missing?
>>
>> Thanks again.
>> --Bob
>>
>>
>> #################now binomial from a reference dataset
>> gm3 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
>>               family = binomial, data = cbpp)
>> mc3<-MCMCglmm(cbind(incidence, size - incidence) ~ period,
>>   random = ~ herd,
>>   family = "multinomial2",
>>   data = cbpp, verbose = FALSE,
>>   nitt = 750E3, thin = 650, burnin = 100E3
>> )
>> summary(gm3)@coefs
>> summary(mc3)
>>
>> Leads to:
>>
>> > summary(gm3)@coefs
>>               Estimate Std. Error   z value     Pr(>|z|)
>> (Intercept) -1.3985351  0.2278906 -6.136871 8.416284e-10
>> period2     -0.9923347  0.3053852 -3.249452 1.156274e-03
>> period3     -1.1286754  0.3260491 -3.461673 5.368286e-04
>> period4     -1.5803739  0.4288037 -3.685542 2.282169e-04
>>
>> Versus:
>>
>> > summary(mc3)
>>
>>  Iterations = 749351
>>  Thinning interval  = 100001
>>  Sample size  = 1000
>> ....
>>             post.mean l-95% CI u-95% CI eff.samp pMCMC
>> (Intercept)   -1.6102  -2.2092  -0.8357     1000 0.002 **
>> period2       -1.2476  -2.2146  -0.1284     1000 0.020 *
>> period3       -1.3743  -2.3387  -0.2746     1000 0.008 **
>> period4       -1.9677  -3.2814  -0.8065     1000 0.008 **
>> ---
>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
> Druk dit bericht a.u.b. niet onnodig af.
> Please do not print this message unnecessarily.
>
> Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer
> en binden het INBO onder geen enkel beding, zolang dit bericht niet   
> bevestigd is
> door een geldig ondertekend document. The views expressed in  this message
> and any annex are purely those of the writer and may not be regarded  
>  as stating
> an official position of INBO, as long as the message is not   
> confirmed by a duly
> signed document.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.