[R] Variance component estimation in glmmPQL
nblarson
nblarson at gmail.com
Tue Nov 27 15:43:11 CET 2012
Hi all,
I've been attempting to fit a logistic glmm using glmmPQL in order to
estimate variance components for a score test, where the model is of the
form logit(mu) = X*a+ Z1*b1 + Z2*b2. Z1 and Z2 are actually reduced rank
square root matrices of the assumed covariance structure (up to a constant)
of random effects c1 and c2, respectively, such that b1 ~ N(0,sig.1^2*I) and
c1 ~ N(0,sig.1^2*K1) , where K1 = Z1*t(Z1), and c1 = Z1*b1.
The model form I've been using is just the following:
m<-glmmPQL(y~1,random=list(f=pdBlocked(list(pdIdnot(~Z.1-1),pdIdnot(~Z.2-1))))
,family=binomial(link="logit"))
I've been extracting the variance components using VarCorr(), but I've
noticed that the reported variances associated with my random effects are
not even close to the values I get if I evaluate their variances empirically
(eg var(random.effects(m.12)). I know that's not how they're actually
estimated, but there may be a whole order of magnitude difference in the
values.
Below is an example under R 2.14 on a linux machine:
library(MASS)
library(mgcv)
library(boot)
set.seed(1234)
G.1<-matrix(rnorm(5000,0,0.25),nrow=100)
G.2<-matrix(rnorm(5000,0,0.25),nrow=100)
K.1<-G.1%*%t(G.1)
K.2<-G.2%*%t(G.2)
Z.1<-mroot(K.1,method="svd")
Z.2<-mroot(K.2,method="svd")
b.1<-matrix(rnorm(ncol(Z.1),0,0.25),ncol=1)
b.2<-matrix(rnorm(ncol(Z.2),0,0.50),ncol=1)
p<-inv.logit(Z.1%*%b.1+Z.2%*%b.2)
y<-rbinom(100,1,p)
f<-rep(1,100)
m.fit<-glmmPQL(y~1,random=list(f=pdBlocked(list(pdIdent(~Z.1-1),pdIdent(~Z.2-1)))),
family=binomial(link="logit"))
VarCorr(m.fit)
var(as.numeric(random.effects(m.fit))[1:ncol(Z.1)])
var(as.numeric(random.effects(m.fit))[-c(1:ncol(Z.1))])
>From the above, VarCorr() results in variance component estimates of sig.1^2
= 0.444 and sig.2^2 = 0.2778, whereas the empirical estimates are sig.1^2 =
0.2060 and sig.1^2 = 0.097. I know variance component estimation in general
is a little shaky, but my simulations suggest that VarCorr() is extracting
values that are way too large on a consistent basis.
I'm largely assuming I'm misinterpreting something here, just can't figure
out what.
-Nick
--
View this message in context: http://r.789695.n4.nabble.com/Variance-component-estimation-in-glmmPQL-tp4650964.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list