[R-sig-ME] extracting gradient and hessian matrix from lme4

Joshua Wiley jwiley.psych at gmail.com
Sun Apr 1 22:02:10 CEST 2012


Thanks to both of you.  Really.  Your time both developing software
and supporting it is amazing and greatly appreciated.

On Sun, Apr 1, 2012 at 10:30 AM, Ben Bolker <bbolker at gmail.com> wrote:
> On 12-04-01 03:50 AM, Douglas Bates wrote:
>> On Fri, Mar 30, 2012 at 3:44 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
>>> Hi Ben,
>>>
>>> Many thanks for the help.  I tried your suggestion out and it seemed
>>> to work (and I learned a bit about lme4 in the process :)
>>>
>>> library(lme4)
>>> fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
>>> fm1Fun <- update(fm1,devFunOnly=TRUE)
>>> library(numDeriv)
>>> fm1_thpar <- getME(fm1,"theta")
>>> h <- hessian(fm1Fun, fm1_thpar)
>>> g <- grad(fm1Fun, fm1_thpar)
>>>
>>> which I can use (I think) to get standard errors of the variance parameters.
>>>
>>> library(MASS)
>>> sqrt(diag(ginv(h)))
>>>
>>> which plug into a longer formula that attempts to test the
>>> significance of indirect effects.  Given that the variance parameters
>>> are not normally distributed, my hunch is that even though both fixed
>>> and random effects (and their variances/standard errors) are being
>>> built into the mediation test, it is probably not well-behaved either,
>>> but it is nice to be able to try to replicate models in the article.
>>> Even if they are not perfectly accurate, I am hoping I can use them as
>>> a sanity check for when I play with some MCMC and bootstrapping.
>>
>> I have been travelling and haven't tracked messages on the list
>> closely so I might have missed something here.  What do you mean by
>> "the variance parameters"?  The theta parameters aren't variances and
>> covariances.  They are values from the relative covariance factor.
>> For the model you fit the first element of theta is the standard
>> deviation of the intercept random effects divided by the standard
>> deviation of the per-observation noise.  The second and third elements
>> only make sense in terms of the Cholesky factor of the relative
>> variance-covariance matrix.

Okay, good to know.  Thank you for pointing this out.  I have a copy
of your book (Pinheiro & Bates) and the draft chapters for lme4 on
R-forge.  If there are any other references you think would help me
gain a good understanding of how lme4 works, I would happily get and
read them.
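
Just to check that I am following, here is how I understand theta maps
back to the variance components for fm1 (my own sketch, so please
correct me if I have it wrong):

library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## theta fills the lower triangle (column-major) of the relative
## Cholesky factor for the Subject term
th <- getME(fm1, "theta")
L <- matrix(0, 2, 2)
L[lower.tri(L, diag = TRUE)] <- th

## rescaling by the residual standard deviation should recover the
## random-effects variance-covariance matrix
sc <- attr(VarCorr(fm1), "sc")   # residual SD
sc^2 * tcrossprod(L)             # should match VarCorr(fm1)$Subject
th[1]                            # = SD(intercept RE) / residual SD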

>>
>> This choice is not arbitrary.  Although we are accustomed to thinking
>> in terms of variances and covariances or in terms of standard
>> deviations and correlations, these are difficult scales on which to
>> optimize because the constraints on these values are complicated
>> nonlinear constraints.

That makes sense.
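
To convince myself, I sketched the 2x2 case: on the theta scale the
only constraints are that the diagonal elements be nonnegative,
whereas on the variance/covariance scale there is the nonlinear
constraint cov^2 <= var1 * var2.  (The values below are arbitrary,
just for illustration.)

th <- c(0.5, -2, 0.1)  # arbitrary; only th[1] >= 0 and th[3] >= 0 are needed
L <- matrix(c(th[1], th[2], 0, th[3]), 2, 2)
tcrossprod(L)          # L %*% t(L) is always a valid (PSD) relative covariance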

>> So as long as you recognize that the "variance parameters" aren't
>> variances or covariances you should be okay.
>
>  The warning is worthwhile.  I haven't taken enough time to figure out
> the context (Appendix of
>
> http://www.unc.edu/~dbauer/manuscripts/bauer-preacher-gil-PM-2006.pdf
>
> as stated), but to me it seems that the context there is actually
> using the RE variances as plug-in estimates (i.e. ignoring their
> uncertainty) ... ??

Here is more context and an example, but I certainly do not expect
anyone to read through all of this and try it.  You have done more
than enough already.

Perhaps an example closer to a real use case would be helpful.  In
psychology, it is common to think of mediation, where one variable
(say x) is associated with an outcome of interest (say y), but the
causal link is believed to run through a third variable, the mediator
(say m).  To give a hypothetical economics example, suppose taxes are
lowered (x), which causes businesses and people to be more productive
and earn more (m), which causes overall government revenue (y) to
increase, even though the tax rate is lower.

Sometimes diagrammed:
x -> m -> y

Causality is often the theory, but it is not necessarily assessed, so
set that aside for the moment.  The interest is in the average
indirect effect of x on y through the variable m.  Again in
psychology, this is usually assessed as the product of the
coefficients of x predicting m and of m predicting y, controlling for
x.  In models with mixed effects, this is complicated by the fact that
one or both of these effects may be random, not fixed.
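
In the simplest single-level case with no random effects, that amounts
to something like this toy sketch (simulated data, just to fix ideas):

set.seed(1)
n <- 200
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)            # x -> m
y <- 0.4 * m + 0.2 * x + rnorm(n)  # m -> y, plus a direct effect of x

a <- coef(lm(m ~ x))["x"]          # a path: x predicting m
b <- coef(lm(y ~ m + x))["m"]      # b path: m predicting y, controlling for x
a * b                              # estimated indirect effect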

To estimate all of the necessary parameters accurately, they need to
be in a single model.  This can be done in a structural equation
modeling framework with a series of equations, or in mixed effects
models by 'stacking' the two dependent variables (y and m) and using
indicator variables.  The referenced paper presents a technique for
estimating the indirect effect (say g):

g = ab + cov(a, b)

where a is the average RE of x predicting m, and b is the average RE
of m predicting y.  The appendix of the paper I referenced uses the
covariance matrix of the REs, the gradient, and the Hessian to come up
with an estimate of Var(g), which goes into an asymptotic test:

g / sqrt(Var(g))

in order to test whether the average indirect effect is significantly
different from zero.  I have my doubts about how accurate this
approach is, given that products of coefficients tend not to be
normally distributed, and I do want to try other methods for
constructing a confidence interval, or perhaps move to a Bayesian
framework and use a credible interval.  But as a first step, I figured
I would try to replicate the article (which was done in SAS) in R, and
then go on to additional techniques.  Below are an online dataset and
R code for the type of model I am trying to run.  I have been able to
replicate the paper's point estimate of the indirect effect (the
authors provide sample data and SAS code) in nlme, and come fairly
close in lme4 (complicated by my difficulty getting a heterogeneous
residual variance structure for the stacked outcomes).

##############################################
require(foreign)
d <- read.dta("http://www.ats.ucla.edu/stat/data/ml_sim.dta")

require(reshape2)
d$fid <- 1:nrow(d)
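## stack the two outcomes (y and m) into one long response column z;
## 'variable' records which outcome each row came from, and m is kept
## as an id variable so it can still predict the y rows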
stacked <- melt(d, id.vars = c("fid", "id", "x", "m"),
  measure.vars = c("y", "m"), value.name = "z")
stacked <- within(stacked, {
    sy <- as.integer(variable == "y")
    sm <- as.integer(variable == "m")
})

require(lme4)
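## stacked-outcome model: the sm rows carry the x -> m equation and the
## sy rows carry the m -> y equation (controlling for x), with all five
## random effects allowed to correlate within id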
mm <- lmer(z ~ 0 + sm + sm:x + sy + sy:m + sy:x +
              (0 + sm + sm:x + sy + sy:m + sy:x | id) +
              (0 + sm | fid), data = stacked)
summary(mm)

## effect of interest is x's effect on y through m
## i.e., sm:x * sy:m + cov_{sm:x, sy:m} because they are REs
## (0 + sm | fid) is an attempt to allow for different residual
## variability in the m versus y outcomes, something roughly
## (though far from exactly) like the varIdent() structure in nlme:

require(nlme)
mm.alt <- lme(z ~ 0 + sm + sm:x + sy + sy:m + sy:x, data = stacked,
       random = ~ 0 + sm + sm:x + sy + sy:m + sy:x | id,
       weights = varIdent(form = ~ 1 | variable))
summary(mm.alt)
##############################################
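
For what it is worth, this is roughly how I am pulling the point
estimate of g out of the lme4 fit and how I picture the asymptotic
test.  The fixed-effect and VarCorr names below are what I expect from
the formula above (they may print slightly differently), and Vest is a
placeholder for the covariance matrix of (a, b, cov(a, b)) that the
paper's appendix assembles from the gradient and Hessian:

## point estimate of the indirect effect g = a*b + cov(a, b)
a <- fixef(mm)["sm:x"]                  # average effect of x on m
b <- fixef(mm)["sy:m"]                  # average effect of m on y
covab <- VarCorr(mm)$id["sm:x", "sy:m"] # covariance of the two random effects
g <- unname(a * b + covab)

## delta-method sketch for the test g / sqrt(Var(g)); Vest is a
## placeholder for the 3x3 covariance matrix of (a, b, covab)
require(numDeriv)
gfun <- function(p) p[1] * p[2] + p[3]
gr <- grad(gfun, c(a, b, covab))
## vg <- drop(t(gr) %*% Vest %*% gr)    # needs Vest from the Hessian step
## z  <- g / sqrt(vg)                   # compare to a standard normal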

Cheers,

Josh

[snip]



