[R] Questions on the results from glmmPQL(MASS)

Mon Dec 8 15:29:14 CET 2008

Ben Bolker's response to a glmmPQL question below raises a question. 

Does the issue of bias with binomial data reported by Breslow (2003)
remain valid with respect specifically to Ripley's treatment of PQL in
glmmPQL? Breslow makes no reference to this particular implementation.
He does discuss that of SAS GLIMMIX, but it does not work exactly as
glmmPQL. I've compared results between binomial models between these two
approaches, and they usually give compatible results. But they can
diverge markedly in enough cases that I wish I understood just how they
differ, so wonder if relative vulnerability to bias could be involved.

BTW Ben refers Zhijie to a separate user group that focuses on mixed
models. I knew nothing of this group. Following through on the link I
found their archive, which included a fairly extensive thread on a
question I posed to the regular R group in October. My question was
forwarded, by Ben Bolker in fact (Wald F tests thread), for which I'm
grateful. But I'm embarassed to say I only learned of the thread, even
though I initiated it, because of this email. I just assumed no
responses, other than R-News, and that was mostly questions to me about
glmmPQL, rather than attempts to answer my own question. I'm clearly not
the only one unaware of the mixed-models group, and a very sad choice
for asking questions about glmmPQL.

>	Mark Fowler
		Population Ecology Division
>	Bedford Inst of Oceanography
>	Dept Fisheries & Oceans
>	Dartmouth NS Canada
		B2Y 4A2
		Tel. (902) 426-3529
		Fax (902) 426-9710
		Email fowlerm at mar.dfo-mpo.gc.ca
		Home Tel. (902) 461-0708
		Home Email mark.fowler at ns.sympatico.ca

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Ben Bolker
Sent: December 6, 2008 4:10 PM
To: r-help at r-project.org
Subject: Re: [R] Questions on the results from glmmPQL(MASS)

zhijie zhang wrote:
> 
> Dear Rusers,
>    I have used R,S-PLUS and SAS to analyze the sample data "bacteria" 
> in MASS package. Their results are listed below.
> I have three questions, anybody can give me possible answers?
> Q1:From the results, we see that R get 'NAs'for AIC,BIC and logLik, 
> while S-PLUS8.0 gave the exact values for them. Why?
> 

This is a philosophical difference between S-PLUS and R.
Since glmmPQL uses quasi-likelihood, technically there is no
log-likelihood (hence no AIC nor BIC, which are based on the
log-likelihood) for this model -- the argument is that one is limited to
looking at Wald tests (testing the Z- or t-statistics, i.e. parameter
estimates divided by estimated standard errors) for inference in this
case.

zhijie zhang wrote:
> 
> Q2: The model to analyse the data is logity=b0+u+b1*trt+b2*I(week>2), 
> but the results for Random effects in R/SPLUS confused me. SAS may be
clearer.
> Random effects:
>  Formula: ~1 | ID
>                (Intercept)          Residual
> StdDev:    1.410637             0.7800511
>   Which is the random effect 'sigma'? I think it is "1.410637", but 
> what does "0.7800511" mean? That is, i want ot know how to explain/use

> the above two data for Random effects.
> 

The (Intercept) random effect is the variance in intercept across
grouping factors .
The residual (0.78) is (I believe) the individual-level error estimated
for the underlying linear mixed model -- you can probably ignore this.

zhijie zhang wrote:
> 
> Q3:In SAS and other softwares, we can get *p*-values for the random 
> effect 'sigma', but i donot see the *p*-values in the results of 
> R/SPLUS. I have used attributes() to look for them, but no *p* values.

> Anybody knows how to get *p*-values for the random effect 'sigma',.
>   Any suggestions or help are greatly appreciated.
> #R Results:MASS' version 7.2-44; R version 2.7.2
> library(MASS)
> summary(glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,family = 
> binomial, data = bacteria))
> 
> Linear mixed-effects model fit by maximum likelihood
>  Data: bacteria
>   AIC BIC logLik
>    NA  NA     NA
> 
> Random effects:
>  Formula: ~1 | ID
>         (Intercept)  Residual
> StdDev:    1.410637 0.7800511
> 
> Variance function:
>  Structure: fixed weights
>  Formula: ~invwt
> Fixed effects: y ~ trt + I(week > 2)
>                     Value          Std.Error      DF       t-value
> p-value
> (Intercept)      3.412014    0.5185033   169       6.580506  0.0000
> trtdrug         -1.247355     0.6440635     47      -1.936696  0.0588
> trtdrug+        -0.754327    0.6453978     47      -1.168779  0.2484
> I(week > 2)TRUE -1.607257 0.3583379 169     -4.485311  0.0000
>  Correlation:
>                 (Intr) trtdrg trtdr+
> trtdrug         -0.598
> trtdrug+        -0.571  0.460
> I(week > 2)TRUE -0.537  0.047 -0.001
> 
> #S-PLUS8.0: The results are the same as R except the followings:
> AIC      BIC    logLik
> 1113.622 1133.984 -550.8111
> 
> #SAS9.1.3
> proc nlmixed data=b;
>  parms b0=-1 b1=1 b2=1 sigma=0.4;
>  yy=b0+u+b1*trt+b2*week;
>  p=1/(1+exp(-yy));
>  Model response~binary(p);
>  Random u~normal(0,sigma) subject=id;
> Run;
> 
>  -2 Log Likelihood = 192.2
>  AIC (smaller is better)=200.2
> AICC (smaller is better) =200.3
> BIC (smaller is better)= 207.8
> 
> 
>                                       Parameter Estimates
> 
>                        Standard
>   Parameter  Estimate     Error    DF  t Value  Pr > |t|   Alpha
> Lower     Upper  Gradient
> 
>   b0             3.4966    0.6512    49     5.37    <.0001    0.05
> 2.1880    4.8052  -4.69E-6
>   trt              -0.6763    0.3352    49    -2.02    0.0491    0.05
> -1.3500  -0.00266  -0.00001
>   I(week>2)   -1.6132    0.4785    49    -3.37    0.0015    0.05
-2.5747
> -0.6516  -9.35E-7
>   sigma        1.5301    0.9632    49     1.59    0.1186    0.05
> -0.4054    3.4656  -2.42E-6
> 

 In general (alas) it is *extremely* difficult to get *correct* p-values
for effects (both fixed and random, although fixed might be even worse)
in GLMMs, despite the fact that SAS will happily give them to you.  In
general you can get p-values for random effects via a likelihood ratio
test on the difference of nested models with and without the relevant
effects.  In this case that's a little bit trickier because (1) glmmPQL
won't give you log-likelihoods
(2) glmmPQL won't fit models without any random effects at all, and
comparing log-likelihoods across different fitting procedures is always
a little risky -- you have to make sure they are including the same
constants in the log-likelihood.
A partial solution is to use the lmer function in the lme4 package:

 lme2 <- lmer(y ~ trt + I(week > 2)+(1|ID), family = binomial, data =
bacteria)

which gives you similar estimates (a good idea in any case, because in
any case PQL is not reliable for binary data -- see Breslow 2003).
This will give you a likelihood etc. for the model.  You still need to
work out whether comparing AIC/BIC log-likelihood between lmer and glm
(which you need to fit the model without random effects) is sensible.

  I would strongly recommend that you follow up further questions on
this topic to r-sig-mixed-models at r-project.org , which is a special
mailing list for mixed models.

  good luck,
   Ben Bolker

--
View this message in context:
http://www.nabble.com/Questions-on-the-results-from-glmmPQL%28MASS%29-tp
20867935p20874037.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.