[R] GLMM(..., family=binomial(link="cloglog"))?
Göran Broström
gb at stat.umu.se
Wed Jun 9 15:38:30 CEST 2004
On Tue, Jun 08, 2004 at 08:32:24AM -0700, Spencer Graves wrote:
> Hi, Doug:
>
> Thanks. I'll try the things you suggest. The observed
> proportions ranged from roughly 0.2 to 0.8 in 100 binomial random
> samples where sigma is at most 0.05. Jim Lindsey's "glmm" does
> Gauss-Hermite quadrature, but I don't know if it bothers with the
> adaptive step. With it, I've seen estimates of the variance component
> ranging from 0.4 to 0.7 or so. Since I simulated from a normal with
> mean 0 and standard deviation 1, the algorithm was clearly
> underestimating what was simulated. My next step, I think, is to program
> adaptive Gauss-Hermite quadrature for something closer to my real problem
> (as you just suggested), and see what I get.
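(Just to fix ideas: the integral that a Gauss-Hermite rule approximates for one
binomial group can be sketched directly. The following is only an illustration of
that integral, not code from any of the packages mentioned; it assumes gauss.quad()
from the 'statmod' package is available.)

## Sketch only: plain (non-adaptive) Gauss-Hermite approximation of the marginal
## likelihood contribution of one group with y successes in 'size' trials,
## logit link, random intercept ~ N(0, sigma^2).
library(statmod)
gh.loglik1 <- function(y, size, beta, sigma, npoints = 20) {
    gh <- gauss.quad(npoints, kind = "hermite")  # nodes/weights for weight exp(-x^2)
    u <- sqrt(2) * gh$nodes                      # substitute u = sqrt(2) * x
    p <- plogis(beta + sigma * u)
    ## integral of dbinom(y, size, p(u)) * dnorm(u) du, approximately:
    log(sum(gh$weights * dbinom(y, size, p)) / sqrt(pi))
}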
[...]
> Douglas Bates wrote:
>
> >Spencer Graves <spencer.graves at pdf.com> writes:
> >
[...]
> >> Consider the following:
> >>
> >>> set.seed(1); N <- 10
> >>> z <- rnorm(N)
> >>> pz <- inv.logit(z)
> >>> DF <- data.frame(z=z, pz=pz, y=rbinom(N, 100, pz)/100, n=100,
> >>> smpl=factor(1:N))
> >>
> >>> GLMM(y~1, family=binomial, data=DF, random=~1|smpl, weights=n)
I think the "weights = n" is the problem, i.e., you summarize the Bernoulli
data into Bin(100, p) data, and that gives a completely different estimate of
the variance of the random effect. (This looks like an error in lme4 to me,
or am I missing something? Doug?) Really, the two ways of representing the data
should give equivalent analyses, but here they don't. The same phenomenon
appears in 'glm', i.e. without random effects, but there it only affects the
residual deviance, df, and AIC; the coefficient estimates agree.
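To see the 'glm' analogue, here is a small self-contained illustration (toy data
made up for this purpose, unrelated to the simulation below): the grouped and the
Bernoulli representations give identical coefficients, but different deviance,
df and AIC.

set.seed(2)
k <- rbinom(10, 10, 0.6)            # successes out of 10 in each of 10 groups
## the same data as 0/1 Bernoulli observations:
y01 <- unlist(lapply(k, function(s) rep(c(1, 0), c(s, 10 - s))))

grouped <- glm(cbind(k, 10 - k) ~ 1, family = binomial)  # Bin(10, p) form
single  <- glm(y01 ~ 1, family = binomial)               # Bernoulli form

rbind(coef(grouped), coef(single))  # identical intercepts
c(AIC(grouped), AIC(single))        # AIC (and deviance, df) differ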
The following is a small function that calls lme4's GLMM twice, first on the
summarized data as before, and then on the same data expanded into individual
Bernoulli observations. I also run 'glmmML' on the expanded data (since glmmML
only accepts the Bernoulli representation):
-------------------------------------------------------------------
sim <- function(N = 10, grpSize = 10, std = 1){
    require(glmmML)
    require(lme4)
    set.seed(1)
    z <- rnorm(N, mean = 0, sd = std)
    ## pz <- inv.logit(z) is identical to plogis(z) (in 'stats'):
    pz <- plogis(z)
    Y <- rbinom(N, grpSize, pz)          # number of successes per group
    ## 'Summary' data frame: one row per group, proportion plus binomial weight
    DF <- data.frame(z = z, pz = pz,
                     y = Y / grpSize,
                     n = grpSize,
                     smpl = factor(1:N))
    fit1 <- GLMM(y ~ 1, family = binomial, data = DF,
                 random = ~1 | smpl, weights = n)
    ## fit1 <- glm(y ~ 1, family = binomial, data = DF, weights = n)
    ## 'Individual' data frame: the same data as 0/1 Bernoulli observations
    n <- N * grpSize
    z <- rep(z, rep(grpSize, N))
    pz <- rep(pz, rep(grpSize, N))
    y <- numeric(n)
    for (i in 1:N) {                     # first Y[i] observations in group i are 1
        if (Y[i] > 0)                    # guard needed when Y[i] == 0
            y[((i - 1) * grpSize + 1):((i - 1) * grpSize + Y[i])] <- 1
    }
    smpl <- as.factor(rep(1:N, rep(grpSize, N)))
    bigDF <- data.frame(z = z, pz = pz, y = y,
                        smpl = smpl)
    fit2 <- GLMM(y ~ 1, family = binomial, data = bigDF, random = ~1 | smpl)
    ## fit2 <- glm(y ~ 1, family = binomial, data = bigDF)
    fitML <- glmmML(y ~ 1, family = binomial,
                    data = bigDF,
                    cluster = bigDF$smpl)
    return(list(fit1 = fit1,
                fit2 = fit2,
                fitML = fitML))
}
----------------------------------------------------------------
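The output below is from a single call, presumably with the default arguments:

res <- sim()   # N = 10 groups of size 10, true random-effect sd = 1
res            # prints fit1, fit2 and fitML as shown below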
Output:
$fit1
Generalized Linear Mixed Model
Family: binomial family with logit link
Fixed: y ~ 1
Data: DF
log-likelihood: -11.59919
Random effects:
Groups Name Variance Std.Dev.
smpl (Intercept) 3.0384e-10 1.7431e-05
Estimated scale (compare to 1) 1.391217
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.48955 0.20602 2.3762 0.01749
Number of Observations: 10
Number of Groups: 10
$fit2
Generalized Linear Mixed Model
Family: binomial family with logit link
Fixed: y ~ 1
Data: bigDF
log-likelihood: -64.8084
Random effects:
Groups Name Variance Std.Dev.
smpl (Intercept) 0.58353 0.7639
Estimated scale (compare to 1) 0.9563351
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.52811 0.32286 1.6357 0.1019
Number of Observations: 100
Number of Groups: 10
$fitML
Call: glmmML(formula = y ~ 1, data = bigDF,
cluster = bigDF$smpl, family = binomial)
coef se(coef) z Pr(>|z|)
(Intercept) 0.5587 0.3313 1.686 0.0917
Standard deviation in mixing distribution: 0.764
Std. Error: 0.3655
Residual deviance: 129.5 on 98 degrees of freedom AIC: 133.5
---------------
Note the big difference in the estimated random-effect 'sd' between the two lme4
runs (1.7e-05 versus 0.764)! Note further how close the estimates from the second
lme4 run and from glmmML are to each other (sd 0.764 in both cases; the intercepts,
0.528 and 0.559, are also close).
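For reference, the marginal log-likelihood that both representations should
maximize can be written down directly; for the Bernoulli representation it differs
only by the additive constant sum(lchoose(n, Y)), which does not involve the
parameters. A rough base-R sketch of that target (my own illustration, using
integrate() rather than code from lme4 or glmmML):

## Marginal log-likelihood of a random-intercept logit model, one binomial
## count per cluster; the random effect is integrated out numerically.
marg.loglik <- function(beta, sigma, Y, size, bernoulli = FALSE) {
    one.group <- function(y) {
        f <- function(u) dbinom(y, size, plogis(beta + sigma * u)) * dnorm(u)
        integrate(f, -Inf, Inf)$value
    }
    ll <- sum(log(sapply(Y, one.group)))
    if (bernoulli) ll <- ll - sum(lchoose(size, Y))  # drop the binomial constants
    ll
}

Plugging in the two reported solutions, e.g. marg.loglik(0.53, 0.76, Y, 10) versus
marg.loglik(0.49, 0, Y, 10) with Y the per-group success counts from sim(), should
make it clear which of them the data actually prefer.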
[...]
Göran
--
Göran Broström tel: +46 90 786 5223
Department of Statistics fax: +46 90 786 6614
Umeå University http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden e-mail: gb at stat.umu.se