Dear Paul,
Thanks for the interesting example.
In this case we know the true ID effects (b), so we can compare the
true and estimated ID effects directly:
> table(dat5$ID) -> tab5 # nr of obs per ID
> plot( b[1:89], ranef(m2)$ID[,1], xlab="true ID effect",
+       ylab="estimated ID effect", pch=ifelse(tab5==2,16,1), cex=2,
+       main="m2 of dat5" )
> abline(a=0,b=1,lty=4)
> legend("top",pch=c(16,1), legend=c("two obs","one obs"), pt.cex=2, ncol=2 )
This confirms that the estimated random effects for ID deviate more from
their true values when only one data point is available for an ID than
when two are available. With more data points per ID it becomes easier to
separate the ID (b) and residual (err) random effects. With few data
points per ID, some of the err variance is instead attributed to the ID
variance. Thus, with the incomplete data in dat5, the variance between
IDs is overestimated (estimate 1.24, true 1.00), as illustrated in the
plot. Conversely, the err variance is underestimated (estimate 0.66,
true 1.00).
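
The first point (larger deviations for IDs with a single observation) can
also be checked with a small base-R sketch. This is not the m2 fit: it
assumes a zero intercept and treats both variances as known and equal to
1, so the ID effect is predicted by simply shrinking the ID mean. The
names n_id, n_obs, b_true, ybar, shrink, b_hat and mse are made up for
this illustration.

```r
# Sketch only: prediction error of ID effects with 1 vs 2 observations per ID.
# Assumptions (not from dat5/m2): intercept 0, ID and err variance both known = 1.
set.seed(1)
n_id   <- 20000                          # many IDs so the averages are stable
n_obs  <- rep(c(1, 2), each = n_id / 2)  # half the IDs have one obs, half have two
b_true <- rnorm(n_id, sd = 1)            # true ID effects, variance 1

# Observed ID mean = true effect + mean residual (variance 1 / n_obs)
ybar <- b_true + rnorm(n_id, sd = sqrt(1 / n_obs))

# Best linear unbiased predictor with known variances:
# shrink the ID mean towards 0 by n / (n + err variance / ID variance)
shrink <- n_obs / (n_obs + 1)
b_hat  <- shrink * ybar

# Mean squared deviation of predicted from true effect, by number of obs
mse <- tapply((b_hat - b_true)^2, n_obs, mean)
print(round(mse, 2))
```

Under these assumptions the expected squared deviation is 1/(n+1), i.e.
1/2 for one observation and 1/3 for two, so the simulated values should
land close to those. In a real fit the variances are estimated as well,
which is where the over- and underestimation described above comes in.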
