[R-sig-ME] glmmPQL Help: Random Effect and Dispersion Parameter

Mon Jun 27 00:12:16 CEST 2011

A few comments on the GLMM simulation example.

First, the simulated data use only a single value as an error. Thus, the line
>  y <- exp(mu+a+b)+rnorm(1,0,sigma.e)
was probably meant to be:
>  y <- exp(mu+a+b)+rnorm(n*m*k,0,sigma.e)

Second, usually this kind of function is used if y can only assume
positive values.
If this was intended for this example, the error should be included in
the exp(). Thus,
>  y <- exp(mu+a+b+rnorm(n*m*k,0,sigma.e) )

Third, with these modifications, the following two calls recover the
input parameters nicely:
> library(lme4a)
> print(m1a <- lmer(y ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu),
> print(m2a <- lmer(log(y2) ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
> detach(package:lme4a)
Unfortunately, as Douglas Bates pointed out, the residual parameter is
not returned in m1a.

Fourth, there appears to be an inconsistency when the two models are
fitted with lmer of lme4.
> library(lme4)
> print(m1 <- lmer(y ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu),
> print(m2 <- lmer(log(y2) ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
> detach(package:lme4)
In m1, parameters are not returned, at least not in the expected
metric, as in m1a.
(MLs, deviances are very similar for m1 and m1a.) In m2, parameters
are returned as in m2a.

I attach an output illustrating the differences..

Reinhold Kliegl

On Sat, Jun 25, 2011 at 6:53 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> This time with the enclosure :-)
>
>
> On Sat, Jun 25, 2011 at 11:52 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Fri, Jun 24, 2011 at 2:28 PM, Yue Yu <parn.yy at gmail.com> wrote:
>>> Thanks a lot, Dennis.
>>>
>>> The reason I am not using lme4 is that its estimated variance
>>> component seems not desirable.
>>>
>>> If you use my simulated data and run the following line
>>>
>>> glm2 <- glmer(y~1+(1|alpha)+(1|beta),family=gaussian(link="log"),data=simu)
>>>
>>> the result will be
>>>
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: y ~ 1 + (1 | alpha) + (1 | beta)
>>>   Data: simu
>>>  AIC   BIC logLik deviance
>>>  508 529.2 -250      500
>>> Random effects:
>>>  Groups   Name        Variance   Std.Dev.
>>>  alpha    (Intercept) 0.01957449 0.139909
>>>  beta     (Intercept) 0.00080326 0.028342
>>>  Residual             0.06888192 0.262454
>>> Number of obs: 1500, groups: alpha, 100; beta, 5
>>>
>>> Fixed effects:
>>>            Estimate Std. Error t value
>>> (Intercept)  1.13506    0.01907   59.52
>>>
>>> The Std.Dev of alpha, beta and residual is far way from the true value
>>> in my simulation. While glmmPQL will give a better result.
>>
>> Hmm, in lme4a the results, shown in the enclosed, are much closer to
>> the values used for simulation.  However, this example does show a
>> deficiency in this implementation of glmer in that the estimate of the
>> residual standard deviation is not calculated and not shown here.
>>
>>> But I still need to find the variance matrix for variance components
>>> and the dispersion parameter, any suggestions?
>>
>> The best suggestion is don't do it.  Estimates of variance components
>> are not symmetrically distributed (think of the simplest case of the
>> estimator of the variance from an i.i.d Gaussian sample, which has a
>> chi-squared distribution).
>>
>>> On Fri, Jun 24, 2011 at 09:52, Dennis Murphy <djmuser at gmail.com> wrote:
>>>> Hi:
>>>>
>>>> glmmPQL has been around a while, and I suspect it was not meant to
>>>> handle crossed random effects. This was one of the original
>>>> motivations for the lme4 package, and it seems to work there, although
>>>> it's using Gauss-Hermite approximations to the likelihood rather than
>>>> PQL:
>>>>
>>>> library(lme4)
>>>> mod1 <- lmer(y ~ 1 + (1 | beta) + (1 | alpha), data = simu)
>>>>> summary(mod1)
>>>> Linear mixed model fit by REML ['summary.mer']
>>>> Formula: y ~ 1 + (1 | beta) + (1 | alpha)
>>>>   Data: simu
>>>> REML criterion at convergence: 584.4204
>>>>
>>>> Random effects:
>>>>  Groups   Name        Variance Std.Dev.
>>>>  alpha    (Intercept) 3.11128  1.7639
>>>>  beta     (Intercept) 0.17489  0.4182
>>>>  Residual             0.05405  0.2325
>>>> Number of obs: 1500, groups: alpha, 100; beta, 5
>>>>
>>>> Fixed effects:
>>>>            Estimate Std. Error t value
>>>> (Intercept)   2.9167     0.2572   11.34
>>>>
>>>> Hopefully that's closer to what you had in mind. If not, take a look
>>>> at Ben Bolker's GLMM wiki:
>>>>
>>>> http://glmm.wikidot.com/faq
>>>>
>>>> BTW, thank you for the nice reproducible example.
>>>>
>>>> Dennis
>>>>
>>>>
>>>> On Thu, Jun 23, 2011 at 9:16 PM, Yue Yu <parn.yy at gmail.com> wrote:
>>>>> Dear R users,
>>>>>
>>>>> I am currently doing a project in generalized mixed model, and I find
>>>>> the R function glmmPQL in MASS can do this via PQL. But I am a newbie
>>>>> in R, the input and output of glmmPQL confused me, and I wish I can
>>>>> get some answers here.
>>>>>
>>>>> The model I used is a typical two-way generalized mixed model with
>>>>> random subject (row) and block (column) effects and log link function,
>>>>> y_{ij} = exp(\mu+\alpha_i+\beta_j)+\epsilon.
>>>>>
>>>>> I can generate a pseudo data by the following R code
>>>>>
>>>>> ===================================================
>>>>> k <- 5; # Number of Blocks (Columns)
>>>>> n <- 100; # Number of Subjects (Rows)
>>>>> m <- 3; # Number of Replications in Each Cell
>>>>>
>>>>> sigma.a <- 0.5; # Var of Subjects Effects
>>>>> sigma.b <- 0.1; # Var of Block Effects
>>>>> sigma.e <- 0.01; # Var of Errors
>>>>> mu <- 1; # Overall mean
>>>>>
>>>>> a <- rep(rnorm(n,0,sigma.a),each=k*m);
>>>>> b <- rep(rep(rnorm(k,0,sigma.b),each=m),n);
>>>>>
>>>>> # Simulate responses y_{ij}=exp(\mu+\alpha_i+\beta_j)+e
>>>>> y <- exp(mu+a+b)+rnorm(1,0,sigma.e);
>>>>>
>>>>> # Indicator vector of subject effects alpha
>>>>> alpha <- rep(seq(1,n),each=k*m);
>>>>>
>>>>> # Indicator vector of block effects beta
>>>>> beta <- rep(rep(seq(1:k),each=m),n);
>>>>>
>>>>> simu <- data.frame(y,beta,alpha)
>>>>> ===================================================
>>>>>
>>>>> And I want to use glmmPQL to estimate the mean and variance
>>>>> components, but I have several questions.
>>>>>
>>>>> 1. How to write the random effect formula?
>>>>> I have tried
>>>>> glm <- glmmPQL(y~1,random=~alpha+beta,family=gaussian(link="log"),data=simu);
>>>>> but it did not work and got the error message "Invalid formula for groups".
>>>>>
>>>>> And the command
>>>>> glm <- glmmPQL(y~1,random=~1|alpha/beta,family=gaussian(link="log"),data=simu)
>>>>> worked, but the result was the nested "beta %in% alpha" variances,
>>>>> which was not what I want.
>>>>>
>>>>> 2. How to find the estimated variance-covariance matrix for the
>>>>> variance components, which should be the inverse of information
>>>>> matrix.
>>>>> I notice the output variable glm at apVar will give a similar
>>>>> variance-covariance matrix, but it has the prefix "reStruct." and
>>>>> attribute "Pars", what are these stand for? I can't find any
>>>>> explanation in the help document.
>>>>>
>>>>> 3. I am also wondering if there is a way to calculate the dispersion
>>>>> parameter or not?
>>>>>
>>>>> Anyone has some tips? Any suggestions will be greatly appreciated.
>>>>>
>>>>> Best,
>>>>>
>>>>> Yue Yu
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
-------------- next part --------------

R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[R.app GUI 1.40 (5751) x86_64-apple-darwin9.8.0]

> setwd('/Users/kliegl/documents/R-2010')
> # Modification of Yue Yu's GLMM simulation with crossed-random factors
> rm(list=ls())
> k <- 20    # Number of Blocks (Columns)
> n <- 100   # Number of Subjects (Rows)
> m <- 20   # Number of Replications in Each Cell
> 
> sigma.a <- 0.5  # SD of Subjects Effects
> sigma.b <- 0.1  # SD of Block Effects
> sigma.e <- 0.01  # SD of Errors
> mu <- 1         # Overall mean
> 
> subject <- rnorm(n, 0, sigma.a)
> block   <- rnorm(k, 0, sigma.b)
> a <- rep(subject, each=k*m)
> b <- rep(rep(block, each=m), n)
> e <- rnorm(k*m*n, 0, sigma.e)
> 
> # Simulate responses y_{ij}= \mu+\alpha_i+\beta_j + e
> y0 <-  mu + a + b + e
> 
> # Simulate responses y_{ij}=exp(\mu+\alpha_i+\beta_j) + e
> y1 <- exp(mu+a+b) + e
> 
> # Simulate responses y_{ij}=exp(\mu+\alpha_i+\beta_j + e)
> y2 <- exp(mu+a+b  + e)
> 
> # Indicator vector of subject effects alpha
> alpha <- rep(seq(1,n), each=k*m)
> 
> # Indicator vector of block effects beta
> beta <- rep(rep(seq(1:k), each=m), n)
> 
> simu <- data.frame(y0, y1, y2, beta, alpha)
> 
> library(lme4)
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

    det

Attaching package: 'lme4'

The following object(s) are masked from 'package:stats':

    AIC, BIC

> print(m1 <- lmer(y0 ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
Linear mixed model fit by REML 
Formula: y0 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
     AIC     BIC logLik deviance REMLdev
 -253106 -253072 126557  -253118 -253114
Random effects:
 Groups   Name        Variance   Std.Dev.
 alpha    (Intercept) 0.23916183 0.489042
 beta     (Intercept) 0.01041611 0.102059
 Residual             0.00010043 0.010022
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05397   19.76
> print(m2 <- lmer(y1 ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu), cor=F)
Generalized linear mixed model fit by the Laplace approximation 
Formula: y1 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
   AIC   BIC logLik deviance
 908.6 942.9 -450.3    900.6
Random effects:
 Groups   Name        Variance  Std.Dev. 
 alpha    (Intercept) 0.0007268 0.0269592
 beta     (Intercept) 0.0000316 0.0056214
 Residual             0.0030743 0.0554466
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept) 1.066542   0.002977   358.3
> print(m3 <- lmer(y2 ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu), cor=F)
Generalized linear mixed model fit by the Laplace approximation 
Formula: y2 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
   AIC   BIC logLik deviance
 958.9 993.2 -475.4    950.9
Random effects:
 Groups   Name        Variance   Std.Dev. 
 alpha    (Intercept) 1.0240e-03 0.0319997
 beta     (Intercept) 4.4498e-05 0.0066707
 Residual             4.3307e-03 0.0658077
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept) 1.066641   0.003533   301.9
> print(m4 <- lmer(log(y2) ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
Linear mixed model fit by REML 
Formula: log(y2) ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
     AIC     BIC logLik deviance REMLdev
 -253106 -253072 126557  -253118 -253114
Random effects:
 Groups   Name        Variance   Std.Dev.
 alpha    (Intercept) 0.23916709 0.489047
 beta     (Intercept) 0.01041841 0.102071
 Residual             0.00010044 0.010022
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05396   19.76
> detach(package:lme4)
> 
> library(lme4a)
Loading required package: minqa
Loading required package: Rcpp
Loading required package: MatrixModels

Attaching package: 'lme4a'

The following object(s) are masked from 'package:stats':

    AIC, BIC

> print(m1a <- lmer(y0 ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
Linear mixed model fit by REML ['merMod']
Formula: y0 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
REML criterion at convergence: -253114 

Random effects:
 Groups   Name        Variance  Std.Dev.
 alpha    (Intercept) 0.2391445 0.48902 
 beta     (Intercept) 0.0104174 0.10207 
 Residual             0.0001004 0.01002 
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05397   19.76
> print(m2a <- lmer(y1 ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu), cor=F)
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: gaussian 
Formula: y1 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
      AIC       BIC    logLik  deviance 
 906.5606  932.3505 -450.2803  900.5606 

Random effects:
 Groups Name        Variance Std.Dev.
 alpha  (Intercept) 0.23625  0.4861  
 beta   (Intercept) 0.01028  0.1014  
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error z value
(Intercept)  1.06688    0.05368   19.88
> print(m3a <- lmer(y2 ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu), cor=F)
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: gaussian 
Formula: y2 ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
      AIC       BIC    logLik  deviance 
 956.8503  982.6402 -475.4251  950.8503 

Random effects:
 Groups Name        Variance Std.Dev.
 alpha  (Intercept) 0.23629  0.4861  
 beta   (Intercept) 0.01027  0.1014  
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error z value
(Intercept)  1.06698    0.05368   19.88
> print(m4a <- lmer(log(y2) ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F) 
Linear mixed model fit by REML ['merMod']
Formula: log(y2) ~ 1 + (1 | alpha) + (1 | beta) 
   Data: simu 
REML criterion at convergence: -253114 

Random effects:
 Groups   Name        Variance  Std.Dev.
 alpha    (Intercept) 0.2391436 0.48902 
 beta     (Intercept) 0.0104175 0.10207 
 Residual             0.0001004 0.01002 
Number of obs: 40000, groups: alpha, 100; beta, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1.06616    0.05397   19.76
> detach(package:lme4a)
> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MatrixModels_0.2-1 minqa_1.1.15       Rcpp_0.9.4         Matrix_0.999375-50
[5] lattice_0.19-23   

loaded via a namespace (and not attached):
[1] codetools_0.2-8   grid_2.13.0       lme4_0.999375-39  lme4a_0.999375-65
[5] nlme_3.1-100      splines_2.13.0    stats4_2.13.0