[R-sig-ME] lme4::mcmcsamp + coda::HPDinterval

Douglas Bates bates at stat.wisc.edu
Wed Apr 2 00:58:55 CEST 2008


On Tue, Apr 1, 2008 at 3:22 PM, Sundar Dorai-Raj
<sundar.dorai-raj at pdf.com> wrote:
> Hi, Prof. Bates,
>
>  Douglas Bates said the following on 4/1/2008 11:40 AM:
>
>  > This is a known problem with MCMC sampling of
>  > variance components.
>
>  So, does this mean the confidence interval is in question, and not the
>  point estimate?

I think I would characterize it as a problem with the mixing of the
Markov chain.  Another way of looking at it is that it is caused by an
improper posterior distribution for the parameter that is getting
stuck.  Frequently we use an improper prior distribution because the
likelihood will dominate the prior and we end up with a proper
posterior distribution (or close enough to being proper that we don't
encounter problems).  However, if the likelihood flattens out so that
it no longer dominates the prior, and does so in a region where it is
not much, much smaller than the likelihood at the estimate, then we
have an improper posterior and a non-negligible probability of the
sampler ending up on that flat patch.  That's when we get problems.
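
For reference, a quick way to see whether a chain has wandered onto such a
flat patch is to look at trace plots of the sampled parameters.  A minimal
sketch (assuming the development version of lme4 mentioned below, where
mcmcsamp returns an object with an xyplot method; 'fm' is just a placeholder
for a fitted lmer model):

library(lme4)
library(lattice)
## draw a reasonably long chain and plot its evolution
mc <- mcmcsamp(fm, n = 1000)
## a long, flat run near zero for one of the ST (relative standard
## deviation) parameters is the symptom of the sampler getting stuck
xyplot(mc)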

> Would Wald intervals not be appropriate either? Or
>  profile likelihood intervals? (I know neither are available in lme4 but
>  I think could hack it if necessary.)

That's probably the best approach, although things may get touchy at
the boundary of the parameter space.  I must leave for a meeting now,
but tomorrow morning I should be able to draft some code to show how
one would evaluate the profiled likelihood.  Basically you do the
following:

# Set the value of the ST parameters (the two relative standard
# deviations for this model)
# This also updates the scaled model matrix A.
.Call("mer_ST_setPars", fm, pars, PACKAGE = "lme4")
# Update the sparse Cholesky factor
.Call("mer_update_L", fm, PACKAGE = "lme4")
# Update the other parts of the Cholesky factor
.Call("mer_update_RX", fm, PACKAGE = "lme4")
# extract the deviance or REML deviance
dev <- fm@deviance["ML"]  # or "REML"

If you want to get sophisticated you can profile one of the ST
parameters with respect to the other.
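
As a rough sketch of how those steps could be wrapped up (the function name
profDev and the grid of values below are only illustrative; the mer_* entry
points are the internal ones shown above, so this again assumes the
development version of lme4):

## evaluate the (ML or REML) deviance at a given value of the ST parameters
profDev <- function(fm, pars, which = "ML") {
    .Call("mer_ST_setPars", fm, pars, PACKAGE = "lme4")  # set ST, update A
    .Call("mer_update_L", fm, PACKAGE = "lme4")          # sparse Cholesky factor
    .Call("mer_update_RX", fm, PACKAGE = "lme4")         # remaining factor parts
    fm@deviance[which]
}
## profile the first relative standard deviation: at each fixed value,
## minimize the deviance over the second one
st1 <- seq(0.01, 2, length.out = 25)
prof1 <- sapply(st1, function(s1)
    optimize(function(s2) profDev(fm, c(s1, s2)), interval = c(0, 10))$objective)
plot(st1, prof1, type = "l", xlab = "ST1", ylab = "profiled deviance")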

>  I've updated the lme4 package and have started testing it. Thanks also for
>  the Gelman reference. I've requested a copy and will take a look.
>
>  Thanks,
>
>  --sundar
>
>  Douglas Bates said the following on 4/1/2008 11:40 AM:
>
>
> > May I suggest that you repeat the experiment in the development
>  > version of the lme4 package?  In that version the HPDinterval function
>  > has been moved to lme4 and it is no longer necessary to attach the
>  > coda package.
>  >
>  > I think it is easier to see what is happening when you use that
>  > version of the package because you can use the xyplot method to
>  > examine the evolution of the sampler.  I enclose a modified version of
>  > your code.  Running this version produces the enclosed plot.  You will
>  > see that the (relative) standard deviation of A (labelled 'ST2') gets
>  > stuck near zero.  This is a known problem with MCMC sampling of
>  > variance components.  The prior distribution corresponds to a "locally
>  > constant" uninformative prior on log(sigma_A).  As long as the
>  > likelihood at sigma = zero is sufficiently small to prevent the MCMC
>  > sampler from getting near there, the sampler proceeds happily.  However,
>  > if the likelihood is not sufficiently small then the MCMC sampler may
>  > wander into the "sigma near zero" region where the posterior density
>  > of log(sigma) is essentially flat, and it gets stuck there.  The recent
>  > paper by Gelman et al. (JCGS, 2008) suggests avoiding this problem by
>  > overparameterizing the model for the MCMC sampler, but I haven't yet
>  > implemented that.
>  >
>  >
>  >
>  > On Tue, Apr 1, 2008 at 12:50 PM, Sundar Dorai-Raj
>  > <sundar.dorai-raj at pdf.com> wrote:
>  >> Hi, lmer+coda Users,
>  >>
>  >>  (reproducible code at end)
>  >>
>  >>  I have a question regarding confidence intervals on the random effects.
>  >>  The data I'm working with is a highly unbalanced, nested design with two
>  >>  random effects (say, A/B), where the B variance is expected to be larger
>  >>  than A variance. I need to compute confidence bounds on the standard
>  >>  deviations of each effect (A and A:B). To do this, I use lme4::mcmcsamp
>  >>  with coda::HPDinterval. As in,
>  >>
>  >>  f <- lmer(...)
>  >>  m <- mcmcsamp(f, n = 1000)
>  >>  (s <- sqrt(exp(HPDinterval(m))))
>  >>
>  >>  However, one of the point estimates falls below the lower bound of the
>  >>  confidence interval. My guess is that this is due to the correlation
>  >>  between A and A:B (due to imbalance? relative magnitude of A vs. A:B?)
>  >>  leading to an unstable model fit. Are the point estimates completely
>  >>  untrustworthy? In this case, should I simply remove the offending random
>  >>  effect and refit? Is there a reference that describes situations such as
>  >>  these (point estimates outside the C.I.)?
>  >>
>  >>  Also, this may be an example where the nlme:::intervals.lme function
>  >>  produces intervals that are nonsensical (at least to me) if you change
>  >>  sd.A below to, say, 0.5.
>  >>
>  >>  Thanks,
>  >>
>  >>  --sundar
>  >>
>  >>  <code>
>  >>  set.seed(2)
>  >>  ## simulate data
>  >>  A <- factor(rep(1:10, each = 4))
>  >>  n <- length(A)
>  >>  B <- factor(rep(1:2, 20))
>  >>  sd.A <- 5
>  >>  sd.B <- 10
>  >>  sd.e <- 0.5
>  >>  rA <- rnorm(nlevels(A), 0, sd.A)
>  >>  rB <- rnorm(nlevels(A:B), 0, sd.B)
>  >>  e <- rnorm(n, 0, sd.e)
>  >>  y <- 0.5 + rA[A] + rB[A:B] + e
>  >>  ## create unbalance
>  >>  out <- seq(length(y)) %in% sample(length(y), 20)
>  >>
>  >>  ## fit model
>  >>  library(lme4)
>  >>  library(coda)
>  >>  f <- lmer2(y ~ (1 | A) + (1 | A:B), subset = !out)
>  >>  m <- mcmcsamp(f, 1000)
>  >>  v <- VarCorr(f)
>  >>  s.lmer <- cbind(sqrt(exp(HPDinterval(m[, -1]))),
>  >>                  est. = c(attr(v, "sc"), sqrt(sapply(v, as.matrix))),
>  >>                  true = c(sd.e, sd.B, sd.A))
>  >>  s.lmer <- s.lmer[, c("lower", "est.", "upper", "true")]
>  >>  rownames(s.lmer) <- c("sigma", "A:B", "A")
>  >>  print(zapsmall(s.lmer), digits = 4)
>  >>  </code>
>  >>
>  >
>



