[R-sig-ME] quasi-binomial family in lme4

Wed Nov 10 17:16:06 CET 2010

To fit a per-observation random effects term you need to expand the
binomial representation to a Bernoulli representation.  You should end
up with
> sum(cbpp$size)
[1] 842
rows, not 52.  Notice that you have 52 levels of herd:id and 52 levels
of id in this model.  Thus the two random effects are the same and the
variance components are not well-defined.

If the data were expanded to 842 observations with a binary response
the appropriate specification of the random effects term would be
(1|herd/id) or, more simply, (1|herd) + (1|id).

On Wed, Nov 10, 2010 at 3:53 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
> Hi,
>
> It seems worrying to me that the Herd:Id estimate and the Id estimate are
> exactly the same. I get a (slightly) different estimate with version
> 0.999375-35 on Linux, but again the estimates are identical. MCMCglmm
> fitting the same model gives a very different answer.
>
> Cheers,
>
> Jarrod
>
>
>
>> summary(gm2)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: incidence ~ period + (1 | Id/herd)
>   Data: cbpp
>   AIC   BIC logLik deviance
>  102.2 114.4 -45.11    90.21
> Random effects:
>  Groups  Name        Variance Std.Dev.
>  herd:Id (Intercept) 0.29654  0.54456
>  Id      (Intercept) 0.29654  0.54456
> Number of obs: 56, groups: herd:Id, 56; Id, 56
>
> Fixed effects:
>            Estimate Std. Error z value Pr(>|z|)
> (Intercept)   1.1129     0.2477   4.492 7.04e-06 ***
> period2      -1.1969     0.4185  -2.860 0.004233 **
> period3      -1.4223     0.4381  -3.246 0.001169 **
> period4      -2.0049     0.5292  -3.788 0.000152 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>
> Correlation of Fixed Effects:
>        (Intr) perid2 perid3
> period2 -0.592
> period3 -0.565  0.335
> period4 -0.468  0.277  0.265
>
> Using a parameter expanded prior in order to improve mixing as the herd
> variance approaches zero:
>
> prior<-list(R=list(V=1, nu=0), G=list(G1=list(V=1, nu=1, alpha.mu=0,
> alpha.V=1000)))
>
> mcmc2<-MCMCglmm(incidence~period, random=~herd, family="poisson", data=cbpp)
>
>> summary(mcmc2)
>
>  Iterations = 12991
>  Thinning interval  = 3001
>  Sample size  = 1000
>
>  DIC: 175.9590
>
>  G-structure:  ~herd
>
>     post.mean  l-95% CI u-95% CI eff.samp
> herd   0.01433 1.067e-16   0.0713    61.58
>
>  R-structure:  ~units
>
>      post.mean l-95% CI u-95% CI eff.samp
> units    0.7347   0.1564    1.421    222.9
>
>  Location effects: incidence ~ period
>
>            post.mean l-95% CI u-95% CI eff.samp  pMCMC
> (Intercept)    1.0749   0.4584   1.5987    844.5  0.002 **
> period2       -1.2054  -2.0862  -0.3556    771.8  0.004 **
> period3       -1.4568  -2.4362  -0.4839    674.8  0.004 **
> period4       -2.0584  -3.1851  -0.9430    446.5 <0.001 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>
>
> Quoting Gosselin Frederic <frederic.gosselin at cemagref.fr>:
>
>> Dear Colleague,
>>
>> good question:
>>
>> the commands are under R2.5.1:
>> ************************************************************************
>> library(lme4)
>> cbppbis<-cbind.data.frame(cbpp,Id=as.factor(1:dim(cbpp)[1]))
>> gm1 <- lmer(incidence ~ period + (1 | herd), family = quasipoisson, data =
>> cbpp)
>> summary(gm1)
>>
>> gm2 <- lmer(incidence ~ period + (1 | Id/herd), family = poisson, data =
>> cbppbis)
>> summary(gm2)
>> **************************************************************************
>> (fope it is declared in the good order for the random effects in gm2)
>>
>> The results do show a slight discrepancy between both methods:
>>
>> *************************************************************************
>>>
>>> summary(gm1)
>>
>> Generalized linear mixed model fit using Laplace
>> Formula: incidence ~ period + (1 | herd)
>>   Data: cbpp
>>  Family: quasipoisson(log link)
>>   AIC   BIC logLik deviance
>>  112.2 122.3 -51.11    102.2
>> Random effects:
>>  Groups   Name        Variance Std.Dev.
>>  herd     (Intercept) 0.35085  0.59233
>>  Residual             1.40470  1.18520
>> number of obs: 56, groups: herd, 15
>>
>> Fixed effects:
>>            Estimate Std. Error t value
>> (Intercept)   1.2812     0.2200   5.824
>> period2      -1.1240     0.3315  -3.391
>> period3      -1.3203     0.3579  -3.689
>> period4      -1.9477     0.4808  -4.051
>>
>> Correlation of Fixed Effects:
>>        (Intr) perid2 perid3
>> period2 -0.339
>> period3 -0.314  0.219
>> period4 -0.233  0.163  0.151
>>
>>
>>
>>
>>> summary(gm2)
>>
>> Generalized linear mixed model fit using Laplace
>> Formula: incidence ~ period + (1 | Id/herd)
>>   Data: cbppbis
>>  Family: poisson(log link)
>>   AIC   BIC logLik deviance
>>  102.2 114.4 -45.11    90.21
>> Random effects:
>>  Groups  Name        Variance Std.Dev.
>>  herd:Id (Intercept) 0.29608  0.54413
>>  Id      (Intercept) 0.29608  0.54413
>> number of obs: 56, groups: herd:Id, 56; Id, 56
>>
>> Estimated scale (compare to  1 )  0.9249959
>>
>> Fixed effects:
>>            Estimate Std. Error z value Pr(>|z|)
>> (Intercept)   1.1149     0.2476   4.503 6.69e-06 ***
>> period2      -1.2013     0.4184  -2.871 0.004089 **
>> period3      -1.4224     0.4378  -3.249 0.001159 **
>> period4      -2.0089     0.5294  -3.795 0.000148 ***
>> ---
>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> Correlation of Fixed Effects:
>>        (Intr) perid2 perid3
>> period2 -0.592
>> period3 -0.565  0.335
>> period4 -0.468  0.277  0.264
>> ************************************************************************
>>
>> The estimates and the standard errors are not exactly the same - which
>> might not be unlogical given that the relationship between variance and mean
>> is not the same in both models. They however are not very far one from the
>> other.
>>
>> Of course, this needs further investigation.
>>
>> I remember of a paper that motivated the use of the quasi-poisson method
>> based on empirical relationships between the residual variance and the mean:
>>
>> Ver Hoef J.M. et Boveng P.L., 2007, Quasi-poisson vs. negative binomial
>> regression: How should we model overdispersed count data?, Ecology, 88, 11,
>> p. 2766-2772.
>>
>> Sincerely,
>>
>> Frédéric
>>
>> -----Message d'origine-----
>> De : John Maindonald [mailto:john.maindonald at anu.edu.au]
>> Envoyé : mercredi 10 novembre 2010 09:51
>> À : Gosselin Frederic
>> Cc : r-sig-mixed-models at r-project.org; tiflo at csli.stanford.edu
>> Objet : Re: [R-sig-ME] quasi-binomial family in lme4
>>
>> I wonder if you have compared the results that you quote with the result
>> you get with observation level random effects in a poisson model.
>>
>> As I see it, use of observation level random effects should, unless there
>> is evidence that a multiplicative effect on the scale of the response is a
>> better fit, replace use of the quasi- models in glm() as well as in
>> generalised linear mixed models.
>>
>> John Maindonald             email: john.maindonald at anu.edu.au
>> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
>> Centre for Mathematics & Its Applications, Room 1194, John Dedman
>> Mathematical Sciences Building (Building 27) Australian National University,
>> Canberra ACT 0200.
>> http://www.maths.anu.edu.au/~johnm
>>
>> On 10/11/2010, at 6:39 PM, Gosselin Frederic wrote:
>>
>>> Hi Florian,
>>>
>>> a different perpsective on the quasi-likelihood debate - that comes out
>>> sporadically on this list:
>>>
>>> (i) I globally agree with the previous repliers that a fully
>>> probabilistic solution looks better - at least aesthetically - than a
>>> quasi-likelihood;
>>>
>>> (ii) however, as I have already mentioned on the list (cf. below),
>>> earlier versions of lme4 give much more sensible results than the latest
>>> versions:
>>> http://markmail.org/message/s4abxhhdacqjkunm
>>>
>>> This is why in the following papers:
>>> Elek Z., Dauffy-Richard & Gosselin F., 2010, Carabid species responses to
>>> hybrid poplar plantation in floodplains in France, Forest Ecology and
>>> Management, 260, 9, p. 1446-1455.
>>>
>>> and
>>>
>>> Vuidot A., Paillet Y., Archaux F. & Gosselin F. (In Press) Influence of
>>> tree characteristics and forest management on tree microhabitats in France,
>>> Biological Conservation.
>>>
>>> we used version the R version 2.5.1 and the associated lme4 version (here
>>> with quasi-poisson, not quasi-bionomial).
>>>
>>> Hope this helps.
>>>
>>> Sincerely,
>>>
>>> Frédéric Gosselin
>>> Engineer & Researcher (PhD) in Forest Ecology Cemagref Domaine des
>>> Barres F-45290 Nogent sur Vernisson France
>>>
>>> http://www.cemagref.fr/les-contacts/les-pages-personnelles-professionn
>>> elles/gosselin-frederic/english-short-scientific-cv
>>>
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>