[R-sig-ME] recover simulated individual-level random effects with glmer?

Ben Bolker bbolker at gmail.com
Mon Jun 22 20:47:40 CEST 2015


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15-06-22 11:36 AM, Ian Carroll wrote:
> I raised this question yesterday on Cross Validated ( 
> stats.stackexchange.com/q/158043/43122), but now realize this might
> be the more appropriate forum for help. Please excuse the
> cross-post if you track both.
> 
> I simulated count data with an observation level random effect,
> then fit a Poisson-family GLMM using glmer (lme4 version 1.1-7).
> The fitted random intercepts show a strange pattern when compared
> to the simulated data, and produce a kinked QQ plot. If this is
> expected, why is it okay? If not, what am I doing wrong?
> 
> Simulated data and fitting:
>> n <- 1000 df <- data.frame(obs=factor(1:n)) df$x <- rnorm(n) df$r
>> <- rnorm(n, sd=3) df$y <- rpois(n=n, lambda=exp(2*df$x + df$r)) 
>> glmer.fit <- glmer(y ~ x + (1|obs), family='poisson', data=df)
> 
> Questioned results:
>> df$r.est <- ranef(glmer.fit)$obs[ , '(Intercept)'] plot(df$r,
>> df$r.est) qqnorm(df$r.est)
> 
> My results (posted on CV) are two overlapping clusters of points:
> in one r and r.est show the expected correlation, in the other
> there is no correlation. The normal QQ plot makes it look like
> different normal distributions are obtained for positive and
> negative intercepts.
> 
> Thanks, Ian


  I think this is inevitable.  If you think about it, the distribution
of the *conditional* modes of the observations with y=0 must be
different from that of the conditional modes of the obs. with y>0.
This is much easier to see with an even more trivial model that
doesn't include a covariate (see below ...)

  I don't think actually violates any of the assumptions we're using.
If you're worried about its effects on estimation & inference you
could do some simulation examples to see if it affects whatever it is
you're interested in (type I error rate, bias/variance of parameter
estimation, etc.)

library(lme4)
library(plyr) ## for mutate
set.seed(101)
n <- 1000
df <- data.frame(obs=factor(1:n),x=rnorm(n))
df <- mutate(df,
             r=rnorm(n, sd=3),
             y=rpois(n=n, lambda=exp(2*x + r)),
             y2 = rpois(n=n,lambda=exp(r)))


glmer.fit <- glmer(y ~ x + (1|obs), family='poisson', data=df)
glmer.fit2 <- update(glmer.fit,nAGQ=20)
glmer.fit3 <- update(glmer.fit, y2 ~ 1 + (1|obs))

## df$r.est <- ranef(glmer.fit2)$obs[ , '(Intercept)']
df$r.est <- ranef(glmer.fit3)$obs[ , '(Intercept)']
cvec <- c("#000000","#FF000080")
cvec2 <- cvec[as.numeric(df$y2==0)+1]
plot(df$r, df$r.est,col=cvec2)

qqnorm(df$r.est,col=cvec[as.numeric(df$y2==0)+1])

df$y2cat <- cut(df$y2,right=FALSE,include.lowest=TRUE,
                    breaks=c(0:6,Inf))
library("ggplot2"); theme_set(theme_bw())
ggplot(df,aes(r.est,fill=y))+
    geom_histogram(alpha=0.5,position="identity",
                   aes(y = ..density..))
ggplot(df,aes(r.est,fill=ycat))+
    geom_density(alpha=0.5,position="identity")



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJViFhMAAoJEOCV5YRblxUH8gMIAMiLYfW2C57AYTyrVjb660wg
uvAkJGQjZ9ORIdm/MqDRydKTuNV16r3WfwSPXWCdLDBJjpdDq2w3dO5CxsbYjOH+
q7L0NWdX0KRjb1oKF2muLbz38dvH6Z4ExjOmJlW3funuVzCVvzhIGV7N/iIMqp5D
yTnK2y/YeVySuO91MqBgbKNtH/mGTfH1iGMbmhrssF1y8EfeWrw6RF7/uti8vhWE
cvAUDtkz74Q1gfJVyeCRPVRj7ADRb6tN7FuWbqukfEoK5etqYkO5QRIv6BLR4qoU
Hs+zPCymno6JZZDpY/D1UG5HChyXJZjFN6Xj7e+gnPtAvBUa5UTIxfy+z1cblT4=
=Ygc1
-----END PGP SIGNATURE-----



More information about the R-sig-mixed-models mailing list