[R-sig-ME] Ben's Point about Centering and GLMM (was: Re: Low intercept estimate in a binomial glmm)
Ben Bolker
bbolker at gmail.com
Thu Apr 11 21:35:00 CEST 2013
Paul Johnson <pauljohn32 at ...> writes:
>
> On Fri, Apr 5, 2013 at 1:24 AM, John Maindonald
> <john.maindonald at ...> wrote:
> > Surely it is an issue of how you define multi-collinearity.
> >
> I don't think so. The definition is the same, but multi-collinearity's
> effect is different for every point in the X space. I mean, the
> elevation in variance estimates due to multi-collinearity depends on
> where you place the y axis. The point estimates that appear in
> regression output are different when you center because you move the y
> axis about by centering. But if you fit in one spot and then project
> the answer over to the other spot, the answer you get about the slope,
> standard error, etc. is all the same, in either model.
>
> Centering appeals to many practitioners because it seems to give
> parameters with smaller standard errors, but it's an illusion.
> Uncertainty about predictions is hourglass-shaped in the X space, and
> if you go into the middle, you have less uncertainty.
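To make this concrete, here is a minimal sketch with ordinary lm and
made-up data (mine, not from the thread): the centered fit is just a
linear reparameterization of the uncentered one, so projecting its
estimates back recovers identical coefficients and covariances.

set.seed(1)
x <- rnorm(100, mean = 50, sd = 2)
y <- 1 + 2 * x + rnorm(100)
xbar <- mean(x)
m0 <- lm(y ~ x)             ## uncentered fit
m1 <- lm(y ~ I(x - xbar))   ## centered fit
## map centered estimates (a0, a1) back to the uncentered scale:
## b0 = a0 - a1*xbar, b1 = a1
A <- matrix(c(1, 0, -xbar, 1), 2, 2)
A %*% coef(m1)              ## matches coef(m0)
A %*% vcov(m1) %*% t(A)     ## matches vcov(m0)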
>
> > Re-parameterisation may however give
> > parameters that are much more interpretable, with much
> > reduced correlations and standard errors. That is the
> > primary reason, if there is one, for doing it.
> >
>
> I think that's a mistake, and I have examples in the rockchalk
> vignette to demonstrate it. If you say "what is the slope when
> observed X = x", and "what is the uncertainty of your estimate when X
> = x?" all of these models give exactly the same answer.
>
> But back to Ben's point about GLMM. That's an eye-opener.
>
> I'd like to build a working example showing that centering
> affects the estimates (beyond rounding error). I need a test case
> that is likely to produce the effect mentioned before I can go any
> further.
>
> pj
> --
> Paul E. Johnson
> Professor, Political Science; Assoc. Director, Center for Research Methods
> 1541 Lilac Lane, Room 504, University of Kansas
> http://pj.freefaculty.org | http://quant.ku.edu
I had to work a little harder than I expected, and the setup is more
extreme, and the effect smaller, than I would like, but in this example
stable lme4 gives a convergence warning and an intercept
that is slightly off (although admittedly only by about 0.5%);
development lme4 actually handles it OK.
Messier data sets will probably behave worse than this (i.e., the
effect will be bigger). For other/better examples, you might trawl
through the list archives to find people complaining about convergence
warnings ... Maybe someone will come forward with an example.
(I used to have a problem if I didn't center the HEIGHT variable in the
Elston tick data example (see ?grouseticks in development lme4),
but things seem to have improved since the last time I tried
it a few years ago ...)
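For reference, here's a hedged sketch of that fit, run in a fresh
session, following my recollection of the model on the ?grouseticks
help page (the centered cHEIGHT covariate is an assumption based on
that page; the current version may differ):

library(lme4)
data(grouseticks)
## center HEIGHT to help convergence (assumed, per the help page)
grouseticks$cHEIGHT <- grouseticks$HEIGHT - mean(grouseticks$HEIGHT)
gt <- glmer(TICKS ~ YEAR + cHEIGHT + (1 | BROOD) + (1 | INDEX) + (1 | LOCATION),
            family = poisson, data = grouseticks)

Back to the constructed example: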
set.seed(101)
nblock <- 5    ## number of groups
nrep <- 5      ## replicates per x value within each group
xr <- 4        ## half-range of x around the center
mx <- 10000    ## large offset, so x runs from 9996 to 10004
sl <- 2        ## true slope
d <- expand.grid(x = mx + (-xr:xr), grp = LETTERS[1:nblock], rep = 1:nrep)
u <- rnorm(nblock, sd = 1)                    ## group-level random effects
d$eta <- with(d, 1 + (x - mx) * sl + u[grp])  ## linear predictor (log scale)
d$y <- rpois(nrow(d), exp(d$eta))             ## Poisson response
## Put fixed effects on a common scale: if the uncentered fit gives
## eta = b0 + b1*x, the intercept comparable to the centered fit's
## is b0 + b1*shift.
ff <- function(model, corr = TRUE, shift = mx) {
    f <- fixef(model)
    if (corr) f + c(f[2] * shift, 0) else f
}
library(lme4.0)  ## "stable" lme4, distributed as lme4.0 at the time
g1 <- glmer(y ~ x + (1|grp), data = d, family = poisson)
## false convergence warning
(f1 <- ff(g1))         ## uncentered fit, intercept shifted to match
g2 <- glmer(y ~ I(x - mx) + (1|grp), data = d, family = poisson)
(f2 <- ff(g2, FALSE))  ## centered fit, no shift needed
detach("package:lme4.0", unload = TRUE)
library(lme4)    ## development lme4
g3 <- glmer(y ~ x + (1|grp), data = d, family = poisson)
(f3 <- ff(g3))
g4 <- glmer(y ~ I(x - mx) + (1|grp), data = d, family = poisson)
(f4 <- ff(g4, FALSE))
cbind(f1, f2, f3, f4)
## stable-lme4, uncentered and centered;
## devel-lme4, uncentered and centered
## f1 f2 f3 f4
## (Intercept) 1.014301 1.019990 1.019982 1.019984
## x 1.999046 1.999055 1.999056 1.999056