[R-sig-ME] glmmPQL Help: Random Effect and Dispersion Parameter

Mon Jun 27 00:35:11 CEST 2011

Dear Reinhold,

Thank you very much for pointing out my typo. Yes, I should use
rnorm(n*m*k,0,sigma.e) instead of rnorm(1,0,sigma.e).

The reason I not included the random error in the exponential function
is that I want to follow the idea of generalized linear model, which
assumes g(E(y))=X\beta+Zb, g is the link function and g=ln() in this
case.

It seems that lme4a is the best one to fit glimmix model while lme4
can not give out a desirable result. Thank all the responders.

Best,

Yue Yu

On Sun, Jun 26, 2011 at 17:12, Reinhold Kliegl
<reinhold.kliegl at gmail.com> wrote:
>
> A few comments on the GLMM simulation example.
>
> First, the simulated data use only a single value as an error. Thus, the line
> >  y <- exp(mu+a+b)+rnorm(1,0,sigma.e)
> was probably meant to be:
> >  y <- exp(mu+a+b)+rnorm(n*m*k,0,sigma.e)
>
> Second, usually this kind of function is used if y can only assume
> positive values.
> If this was intended for this example, the error should be included in
> the exp(). Thus,
> >  y <- exp(mu+a+b+rnorm(n*m*k,0,sigma.e) )
>
> Third, with these modifications, the following two calls recover the
> input parameters nicely:
> > library(lme4a)
> > print(m1a <- lmer(y ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu),
> > print(m2a <- lmer(log(y2) ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
> > detach(package:lme4a)
> Unfortunately, as Douglas Bates pointed out, the residual parameter is
> not returned in m1a.
>
> Fourth, there appears to be an inconsistency when the two models are
> fitted with lmer of lme4.
> > library(lme4)
> > print(m1 <- lmer(y ~ 1 + (1|alpha) + (1|beta), family=gaussian(link="log"), data=simu),
> > print(m2 <- lmer(log(y2) ~ 1 + (1|alpha) + (1|beta), data=simu), cor=F)
> > detach(package:lme4)
> In m1, parameters are not returned, at least not in the expected
> metric, as in m1a.
> (MLs, deviances are very similar for m1 and m1a.) In m2, parameters
> are returned as in m2a.
>
> I attach an output illustrating the differences..
>
> Reinhold Kliegl
>
> On Sat, Jun 25, 2011 at 6:53 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> > This time with the enclosure :-)
> >
> >
> > On Sat, Jun 25, 2011 at 11:52 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
> >> On Fri, Jun 24, 2011 at 2:28 PM, Yue Yu <parn.yy at gmail.com> wrote:
> >>> Thanks a lot, Dennis.
> >>>
> >>> The reason I am not using lme4 is that its estimated variance
> >>> component seems not desirable.
> >>>
> >>> If you use my simulated data and run the following line
> >>>
> >>> glm2 <- glmer(y~1+(1|alpha)+(1|beta),family=gaussian(link="log"),data=simu)
> >>>
> >>> the result will be
> >>>
> >>> Generalized linear mixed model fit by the Laplace approximation
> >>> Formula: y ~ 1 + (1 | alpha) + (1 | beta)
> >>>   Data: simu
> >>>  AIC   BIC logLik deviance
> >>>  508 529.2 -250      500
> >>> Random effects:
> >>>  Groups   Name        Variance   Std.Dev.
> >>>  alpha    (Intercept) 0.01957449 0.139909
> >>>  beta     (Intercept) 0.00080326 0.028342
> >>>  Residual             0.06888192 0.262454
> >>> Number of obs: 1500, groups: alpha, 100; beta, 5
> >>>
> >>> Fixed effects:
> >>>            Estimate Std. Error t value
> >>> (Intercept)  1.13506    0.01907   59.52
> >>>
> >>> The Std.Dev of alpha, beta and residual is far way from the true value
> >>> in my simulation. While glmmPQL will give a better result.
> >>
> >> Hmm, in lme4a the results, shown in the enclosed, are much closer to
> >> the values used for simulation.  However, this example does show a
> >> deficiency in this implementation of glmer in that the estimate of the
> >> residual standard deviation is not calculated and not shown here.
> >>
> >>> But I still need to find the variance matrix for variance components
> >>> and the dispersion parameter, any suggestions?
> >>
> >> The best suggestion is don't do it.  Estimates of variance components
> >> are not symmetrically distributed (think of the simplest case of the
> >> estimator of the variance from an i.i.d Gaussian sample, which has a
> >> chi-squared distribution).
> >>
> >>> On Fri, Jun 24, 2011 at 09:52, Dennis Murphy <djmuser at gmail.com> wrote:
> >>>> Hi:
> >>>>
> >>>> glmmPQL has been around a while, and I suspect it was not meant to
> >>>> handle crossed random effects. This was one of the original
> >>>> motivations for the lme4 package, and it seems to work there, although
> >>>> it's using Gauss-Hermite approximations to the likelihood rather than
> >>>> PQL:
> >>>>
> >>>> library(lme4)
> >>>> mod1 <- lmer(y ~ 1 + (1 | beta) + (1 | alpha), data = simu)
> >>>>> summary(mod1)
> >>>> Linear mixed model fit by REML ['summary.mer']
> >>>> Formula: y ~ 1 + (1 | beta) + (1 | alpha)
> >>>>   Data: simu
> >>>> REML criterion at convergence: 584.4204
> >>>>
> >>>> Random effects:
> >>>>  Groups   Name        Variance Std.Dev.
> >>>>  alpha    (Intercept) 3.11128  1.7639
> >>>>  beta     (Intercept) 0.17489  0.4182
> >>>>  Residual             0.05405  0.2325
> >>>> Number of obs: 1500, groups: alpha, 100; beta, 5
> >>>>
> >>>> Fixed effects:
> >>>>            Estimate Std. Error t value
> >>>> (Intercept)   2.9167     0.2572   11.34
> >>>>
> >>>> Hopefully that's closer to what you had in mind. If not, take a look
> >>>> at Ben Bolker's GLMM wiki:
> >>>>
> >>>> http://glmm.wikidot.com/faq
> >>>>
> >>>> BTW, thank you for the nice reproducible example.
> >>>>
> >>>> Dennis
> >>>>
> >>>>
> >>>> On Thu, Jun 23, 2011 at 9:16 PM, Yue Yu <parn.yy at gmail.com> wrote:
> >>>>> Dear R users,
> >>>>>
> >>>>> I am currently doing a project in generalized mixed model, and I find
> >>>>> the R function glmmPQL in MASS can do this via PQL. But I am a newbie
> >>>>> in R, the input and output of glmmPQL confused me, and I wish I can
> >>>>> get some answers here.
> >>>>>
> >>>>> The model I used is a typical two-way generalized mixed model with
> >>>>> random subject (row) and block (column) effects and log link function,
> >>>>> y_{ij} = exp(\mu+\alpha_i+\beta_j)+\epsilon.
> >>>>>
> >>>>> I can generate a pseudo data by the following R code
> >>>>>
> >>>>> ===================================================
> >>>>> k <- 5; # Number of Blocks (Columns)
> >>>>> n <- 100; # Number of Subjects (Rows)
> >>>>> m <- 3; # Number of Replications in Each Cell
> >>>>>
> >>>>> sigma.a <- 0.5; # Var of Subjects Effects
> >>>>> sigma.b <- 0.1; # Var of Block Effects
> >>>>> sigma.e <- 0.01; # Var of Errors
> >>>>> mu <- 1; # Overall mean
> >>>>>
> >>>>> a <- rep(rnorm(n,0,sigma.a),each=k*m);
> >>>>> b <- rep(rep(rnorm(k,0,sigma.b),each=m),n);
> >>>>>
> >>>>> # Simulate responses y_{ij}=exp(\mu+\alpha_i+\beta_j)+e
> >>>>> y <- exp(mu+a+b)+rnorm(1,0,sigma.e);
> >>>>>
> >>>>> # Indicator vector of subject effects alpha
> >>>>> alpha <- rep(seq(1,n),each=k*m);
> >>>>>
> >>>>> # Indicator vector of block effects beta
> >>>>> beta <- rep(rep(seq(1:k),each=m),n);
> >>>>>
> >>>>> simu <- data.frame(y,beta,alpha)
> >>>>> ===================================================
> >>>>>
> >>>>> And I want to use glmmPQL to estimate the mean and variance
> >>>>> components, but I have several questions.
> >>>>>
> >>>>> 1. How to write the random effect formula?
> >>>>> I have tried
> >>>>> glm <- glmmPQL(y~1,random=~alpha+beta,family=gaussian(link="log"),data=simu);
> >>>>> but it did not work and got the error message "Invalid formula for groups".
> >>>>>
> >>>>> And the command
> >>>>> glm <- glmmPQL(y~1,random=~1|alpha/beta,family=gaussian(link="log"),data=simu)
> >>>>> worked, but the result was the nested "beta %in% alpha" variances,
> >>>>> which was not what I want.
> >>>>>
> >>>>> 2. How to find the estimated variance-covariance matrix for the
> >>>>> variance components, which should be the inverse of information
> >>>>> matrix.
> >>>>> I notice the output variable glm at apVar will give a similar
> >>>>> variance-covariance matrix, but it has the prefix "reStruct." and
> >>>>> attribute "Pars", what are these stand for? I can't find any
> >>>>> explanation in the help document.
> >>>>>
> >>>>> 3. I am also wondering if there is a way to calculate the dispersion
> >>>>> parameter or not?
> >>>>>
> >>>>> Anyone has some tips? Any suggestions will be greatly appreciated.
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Yue Yu
> >>>>>
> >>>>> _______________________________________________
> >>>>> R-sig-mixed-models at r-project.org mailing list
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>>>>
> >>>>
> >>>
> >>> _______________________________________________
> >>> R-sig-mixed-models at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>>
> >>
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> >