[R] GLMMs fitted with lmer (R) & glimmix (SAS
Andrea Previtali
aprevitali at hotmail.com
Fri Jan 11 21:26:03 CET 2008
Thanks for your responses.
I knew that when you include an interaction term in a model you must include
the main effects of each of the factors. Therefore, I assumed that SAS will
do that by default. In most statistical packages, as in R, the main effects
are automatically added when you include an interaction. But after reading
your response I became suspicious and, it turns out, that SAS was indeed not
including the main effects. To make sure that the main effects are part of
the model in SAS you need to specify: sex eli sex*eli or use sex|eli. When I
did this the estimates for the interaction terms now match closely those
from R and, yes, the interaction between sex and infection status is now non
significant.
Here is the relevant part of the new SAS output:
Model Information
Variance Matrix Blocked By Site
Estimation Technique: Residual PL
Degrees of Freedom Method: Containment
Fit Statistics
-2 Res Log Pseudo-Likelihood: 17868.73
Pseudo-AIC: 17890.73
Pseudo-BIC: 17957.14
Covariance Parameter Estimates
Cov Parm Subject Estimate Std Error
Intercept Site 0.2975 0.1799
Solutions for Fixed Effects
Effect SEX ELI DW DIST SEA Estimate Std. Error DF t
Value Pr > |t|
Intercept -1.1427
0.4611 17 -2.48 0.0240
SEX 0 -0.6064
0.1673 3077 -3.62 0.0003
SEX 1 0
. . . .
ELI 0 -0.1905
0.2197 3077 -0.87 0.3858
ELI 1 0
. . . .
SEX*ELI 0 0 -0.4664 0.4617
3077 -1.01 0.3125
SEX*ELI 0 1 0 .
. . .
SEX*ELI 1 0 0 .
. . .
SEX*ELI 1 1 0 .
. . .
DW 0 -0.3308
0.1760 3077 -1.88 0.0603
DW 1 0
. . . .
DIST 0 -0.1185
0.3816 3077 -0.31 0.7562
DIST 1 0
. . . .
DW*DIST 0 0 -1.0148 0.4064
3077 -2.50 0.0126
DW*DIST 0 1 0 .
. . .
DW*DIST 1 0 0 .
. . .
DW*DIST 1 1 0 .
. . .
SEAS 0 -0.7839
0.1588 3077 -4.94 <.0001
SEAS 1 0
. . . .
DEN -0.01343
0.002588 3077 -5.19 <.0001
WT 0.00758
0.01912 3077 0.40 0.6918
And here is again the R output using lmer fro easy comparison:
Generalized linear mixed model fit using PQL
Formula: SURV ~ SEX * ELI + DW * DIST + SEAS + DEN + WT + (1 | SITE)
Family: binomial(logit link)
AIC BIC logLik deviance
1539 1606 -758.7 1517
Random effects:
Groups Name Variance Std.Dev.
SITE (Intercept) 0.27816 0.52741
number of obs: 3104, groups: SITE, 19
Estimated scale (compare to 1 ) 0.9458749
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.144259 0.458672 -2.495 0.012606
SEX -0.606026 0.167289 -3.623 0.000292 ***
ELI -0.190757 0.219599 -0.869 0.385034
DW -0.328796 0.175882 -1.869 0.061565 .
DIST -0.117745 0.374148 -0.315 0.752989
SEAS -0.784971 0.158748 -4.945 7.62e-07 ***
DEN -0.013381 0.002585 -5.176 2.27e-07 ***
WT 0.007735 0.019115 0.405 0.685732
SEX:ELI -0.466425 0.461596 -1.010 0.312274
DW:DIST -1.015454 0.404683 -2.509 0.012099 *
Now it is strange that after I fitted the model correctly the estimate for
the random factor did not change, as well as the fit statistics. How can
this be?
Regarding the question of What is the Pseudo-Likelihood that is part of the
output in SAS and not in R?
The manual for the glimmix procedure (which can be found here
http://support.sas.com/rnd/app/da/glimmix.html
http://support.sas.com/rnd/app/da/glimmix.html ) says that "For a model
containing random effects, the GLIMMIX procedure, by default, estimates the
parameters by applying pseudo-likelihood techniques as in Wolfinger and
O’Connell (1993) and Breslow and Clayton (1993)." That is, by default SAS
uses the PQL method that as David Buffy said, it is just an approximation.
The procedure involves a series of optimizations obtained via iterative
estimation methods based on linearizations (using Taylor series expansions).
After each optimization, a new pseudo-model is constructed for the mean
response. All the fit statistics (AIC, BIC, etc) that SAS reports are
calculated from the likelihood of the final "pseudo" model, thus the term
"pseudo-likelihood." The problem is that then these criteria cannot be used
to compare models; which is too bad because, at least in my field, it is
preferable to use information theory than p-values based on arbitrary set
significance levels (see Anderson Burnham and Thompson, 2000).
What are the likelihood values that we obtain when using lmer in R? Can they
be used to compare models?
--
View this message in context: http://www.nabble.com/GLMMs-fitted-with-lmer-%28R%29---glimmix-%28SAS%29-tp14623190p14764018.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list