[R-sig-ME] different aic and LL in glmer(lme4) and glimmix(SAS)?
Douglas Bates
bates at stat.wisc.edu
Thu Jul 1 20:16:13 CEST 2010
On Thu, Jul 1, 2010 at 11:24 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
> On Thu, Jul 1, 2010 at 11:03 AM, Jeffrey Evans
> <Jeffrey.Evans at dartmouth.edu> wrote:
>> Hello All,
>
>> I have read several posts related to this previously, but haven't found any
>> resolution yet. When running the same GLMM in glmer and in SAS PROC GLIMMIX,
>> both programs return comparable parameter estimates, but wildly different
>> likelihoods and AIC values.
>
>> In SAS I specify use of the Laplace approximation. In R, I believe this is
>> the default (no?).
>
>> What's the difference, and [how] can I reproduce the SAS -2ll in glmer?
>
> The difference is probably due to the way that the deviance is defined
> for the binomial family in R. A glm family object is a list of
> functions and expressions. One of the functions, called "dev.resids"
> has arguments y, mu and weights. You can specify the response for a
> binomial family as the 0/1 responses or as a matrix with two columns
> as you did here. When you use the two column specification the
> response y is transformed to the fraction of successes and the number
> of cases is incorporated in the weights. It turns out that this is
> all the information necessary for obtaining the mle's of the
> parameters but it does not give the same deviance as you would get by
> listing the 0/1 responses.
>
> I'll write an example using the cbpp data from the lme4 package.
I enclose the example I promised.
>> Thanks,
>> Jeff
>>
>> \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
>>> R_GLMM = glmer(cbind(SdlFinal, SdlMax-SdlFinal) ~ lnsdlmaxd*lnadultssdld +
>> (1|ID),data=sdlPCAdat,family="binomial")
>>> R_GLMM
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: cbind(SdlFinal, SdlMax - SdlFinal) ~ lnsdlmaxd * lnadultssdld +
>> (1 | ID)
>> Data: sdlPCAdat
>> AIC BIC logLik deviance
>> 1150 1165 -570 1140 <------------------ this line!!
>> Random effects:
>> Groups Name Variance Std.Dev.
>> ID (Intercept) 1.2491 1.1176
>> Number of obs: 144, groups: ID, 48
>>
>> Fixed effects:
>> Estimate Std. Error z value Pr(>|z|)
>> (Intercept) 4.56964 0.43148 10.591 < 2e-16 ***
>> lnsdlmaxd -0.65936 0.05686 -11.595 < 2e-16 ***
>> lnadultssdld -0.64534 0.15861 -4.069 4.73e-05 ***
>> lnsdlmaxd:lnadultssdld 0.07393 0.02166 3.414 0.00064 ***
>> ---
>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> Correlation of Fixed Effects:
>> (Intr) lnsdlm lndlts
>> lnsdlmaxd -0.923
>> lnadltssdld -0.461 0.479
>> lnsdlmxd:ln 0.482 -0.508 -0.994
>>
>>
>> \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
>> title 'SAS GLMM';
>> proc glimmix data=sdlPCAdat ic=pq noitprint method=laplace;
>> class site id;
>> model sdlfinal/sdlmax = lnsdlmaxd|lnadultssdld/ solution dist=binomial;
>> random ID /;
>> covtest glm/wald;
>> run;
>>
>> //////////////////////////////////////////////////////////////////////
>>
>> SAS GLMM 19:36 Wednesday, June 30, 2010 88
>>
>> The GLIMMIX Procedure
>>
>> Data Set WORK.SDLPCADAT
>> Response Variable (Events) SdlFinal
>> Response Variable (Trials) SdlMax
>> Response Distribution Binomial
>> Link Function Logit
>> Variance Function Default
>> Variance Matrix Not blocked
>> Estimation Technique Maximum Likelihood
>> Likelihood Approximation Laplace
>> Degrees of Freedom Method Containment
>>
>>
>>
>> Optimization Information
>>
>> Optimization Technique Dual Quasi-Newton
>> Parameters in Optimization 5
>> Lower Boundaries 1
>> Upper Boundaries 0
>> Fixed Effects Not Profiled
>> Starting From GLM estimates
>>
>> Convergence criterion (GCONV=1E-8) satisfied.
>>
>> Fit Statistics
>>
>> -2 Log Likelihood 1653.90 <------------------ this
>> line!!
>> AIC (smaller is better) 1663.90
>> AICC (smaller is better) 1664.33
>> BIC (smaller is better) 1673.25
>> CAIC (smaller is better) 1678.25
>> HQIC (smaller is better) 1667.43
>>
>>
>> Fit Statistics for Conditional Distribution
>>
>> -2 log L(SdlFinal | r. effects) 1436.44
>> Pearson Chi-Square 908.07
>> Pearson Chi-Square / DF 6.31
>>
>>
>> Covariance Parameter Estimates
>>
>> Cov Standard Z
>> Parm Estimate Error Value Pr > Z
>>
>> ID 1.2491 0.2746 4.55 <.0001
>>
>>
>> Solutions for Fixed Effects
>>
>> Standard
>> Effect Estimate Error DF t Value Pr > |t|
>>
>> Intercept 4.5696 0.4333 47 10.55 <.0001
>> lnsdlmaxd -0.6594 0.05717 93 -11.53 <.0001
>> lnadultssdld -0.6453 0.1593 93 -4.05 0.0001
>> lnsdlmaxd*lnadultssd0.07394 0.02174 93 3.40 0.0010
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
-------------- next part --------------
R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(lme4a)
Loading required package: Matrix
Loading required package: lattice
Attaching package: 'Matrix'
The following object(s) are masked from 'package:base':
det
Loading required package: minqa
Loading required package: Rcpp
Attaching package: 'lme4a'
The following object(s) are masked from 'package:stats':
AIC
>
> ##' <description>
> ##' Expand the data frame for the matrix form of the binomial response
> ##' to the vector form of the response.
> ##' <details>
> ##' In the formula for the glm and glmer functions with family =
> ##' binomial the response can be specified as a matrix with two
> ##' columns, corresponding to numbers of successes and failures. The
> ##' parameter estimates for the model will be the same as those from
> ##' the vector response but the deviance itself is a different value.
> ##'
> ##' This function expands the model frame for the matrix response to
> ##' the equivalent frame for the vector response, replacing the
> ##' response matrix with a vector called y. of 1's and 0's
> ##' @title Expand binomial matrix response frame
> ##' @param fr model frame from a glm or glmer fitted model with
> ##' family=binomial and the matrix response specification.
> ##' @return expanded model frame with the response matrix replaced by
> ##' a vector of 1's and 0's.
> ##' @author Douglas Bates
> expandsu <- function(fr) {
+ stopifnot(is(fr, "data.frame"),
+ is(mm <- fr[[1]], "matrix"),
+ ncol(mm) == 2,
+ is.numeric(mm),
+ all(mm >= 0))
+ nr <- nrow(mm)
+ within(fr[, -1][rep.int(seq_len(nr), rowSums(mm)), ],
+ y. <- rep.int(rep.int(c(1,0), nr), as.vector(t(mm))))
+ }
>
> str(cbpp)
'data.frame': 56 obs. of 4 variables:
$ herd : Factor w/ 15 levels "1","2","3","4",..: 1 1 1 1 2 2 2 3 3 3 ...
$ incidence: num 2 3 4 0 3 1 1 8 2 0 ...
$ size : num 14 12 9 5 22 18 21 22 16 16 ...
$ period : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 1 2 3 ...
> (fm1 <- glmer(cbind(incidence, size - incidence) ~ 0 + period +
+ (1|herd), cbpp, binomial, verbose=1L))
npt = 3 , n = 1
rhobeg = 0.2 , rhoend = 2e-07
0.020: 4: 100.209;0.600000
0.0020: 7: 100.154;0.649390
0.00020: 10: 100.152;0.641932
2.0e-05: 12: 100.152;0.641823
2.0e-06: 13: 100.152;0.641823
2.0e-07: 15: 100.152;0.641815
At return
18: 100.15189: 0.641815
npt = 11 , n = 5
rhobeg = 0.5840305 , rhoend = 5.840305e-07
0.058: 11: 100.152;0.641815 -1.36047 -2.33665 -2.47155 -2.92015
0.0058: 18: 100.108;0.653041 -1.38121 -2.38874 -2.52675 -2.97967
0.00058: 31: 100.095;0.642982 -1.39633 -2.38831 -2.52555 -2.97656
5.8e-05: 48: 100.095;0.642327 -1.39893 -2.39134 -2.52748 -2.97945
5.8e-06: 57: 100.095;0.642392 -1.39894 -2.39126 -2.52758 -2.97924
5.8e-07: 66: 100.095;0.642393 -1.39894 -2.39125 -2.52758 -2.97924
At return
78: 100.09497: 0.642392 -1.39894 -2.39125 -2.52758 -2.97924
Generalized linear mixed model fit by maximum likelihood ['merMod']
Family: binomial
Formula: cbind(incidence, size - incidence) ~ 0 + period + (1 | herd)
Data: cbpp
AIC BIC logLik deviance
110.0950 120.2217 -50.0475 100.0950
Random effects:
Groups Name Variance Std.Dev.
herd (Intercept) 0.4127 0.6424
Number of obs: 56, groups: herd, 15
Fixed effects:
Estimate Std. Error z value
period1 -1.3989 0.2279 -6.138
period2 -2.3912 0.3103 -7.705
period3 -2.5276 0.3308 -7.641
period4 -2.9792 0.4327 -6.885
Correlation of Fixed Effects:
perid1 perid2 perid3
period2 0.389
period3 0.365 0.289
period4 0.280 0.220 0.205
> str(cbpp1 <- expandsu(model.frame(fm1)))
'data.frame': 842 obs. of 3 variables:
$ period: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
$ herd : Factor w/ 15 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
$ y. : num 1 1 0 0 0 0 0 0 0 0 ...
> (fm1a <- glmer(y. ~ 0 + period + (1|herd), cbpp1, binomial,
+ verbose=1L))
npt = 3 , n = 1
rhobeg = 0.2 , rhoend = 2e-07
0.020: 4: 555.118;0.600000
0.0020: 7: 555.062;0.649390
0.00020: 10: 555.060;0.641932
2.0e-05: 12: 555.060;0.641823
2.0e-06: 13: 555.060;0.641823
2.0e-07: 15: 555.060;0.641815
At return
17: 555.05991: 0.641815
npt = 11 , n = 5
rhobeg = 0.5840305 , rhoend = 5.840305e-07
0.058: 11: 555.060;0.641815 -1.36047 -2.33665 -2.47155 -2.92015
0.0058: 18: 555.016;0.653041 -1.38121 -2.38874 -2.52675 -2.97967
0.00058: 31: 555.003;0.642982 -1.39633 -2.38831 -2.52555 -2.97656
5.8e-05: 48: 555.003;0.642327 -1.39893 -2.39134 -2.52748 -2.97945
5.8e-06: 57: 555.003;0.642392 -1.39894 -2.39126 -2.52758 -2.97924
5.8e-07: 66: 555.003;0.642393 -1.39894 -2.39125 -2.52758 -2.97924
At return
76: 555.00300: 0.642392 -1.39894 -2.39125 -2.52758 -2.97924
Generalized linear mixed model fit by maximum likelihood ['merMod']
Family: binomial
Formula: y. ~ 0 + period + (1 | herd)
Data: cbpp1
AIC BIC logLik deviance
565.0030 588.6819 -277.5015 555.0030
Random effects:
Groups Name Variance Std.Dev.
herd (Intercept) 0.4127 0.6424
Number of obs: 842, groups: herd, 15
Fixed effects:
Estimate Std. Error z value
period1 -1.3989 0.2279 -6.138
period2 -2.3912 0.3103 -7.705
period3 -2.5276 0.3308 -7.641
period4 -2.9792 0.4327 -6.885
Correlation of Fixed Effects:
perid1 perid2 perid3
period2 0.389
period3 0.365 0.289
period4 0.280 0.220 0.205
>
>
> proc.time()
user system elapsed
6.140 0.130 6.405
More information about the R-sig-mixed-models
mailing list