Göran Broström gb at stat.umu.se
Wed Jun 9 15:38:30 CEST 2004

```On Tue, Jun 08, 2004 at 08:32:24AM -0700, Spencer Graves wrote:
> Hi, Doug:
>
>      Thanks.  I'll try the things you suggests.  The observed
> proportions ranged from roughly 0.2 to 0.8 in 100 binomial random
> samples where sigma is at most 0.05.  Jim Lindsey's "glmm" does
> Gauss-Hermite quadrature, but I don't know if it bothers with the
> adaptive step.  With it, I've seen estimates of the variance component
> ranging from 0.4 to 0.7 or so.  Since I simulated normal 0 standard
> deviation of 1, the algorithm was clearly underestimating what was
> simulated.  My next step, I think, is to program adaptive Gauss-Hermite
> quadrature for something closer to my real problem (as you just
> suggested), and see what I get.
[...]

> Douglas Bates wrote:
>
> >Spencer Graves <spencer.graves at pdf.com> writes:
> >
[...]
> >>	  Consider the following:
> >>
> >>> set.seed(1); N <- 10
> >>> z <- rnorm(N)
> >>> pz <- inv.logit(z)
> >>> DF <- data.frame(z=z, pz=pz, y=rbinom(N, 100, pz)/100, n=100,
> >>> smpl=factor(1:N))
> >>
> >>> GLMM(y~1, family=binomial, data=DF, random=~1|smpl, weights=n)

I think the "weights=n" is the problem, i.e., you summarize Bernoulli
data to Bin(100, p) data, and that gives a completely different estimate of
the variance of the random effect. (This looks as an error in lme4 to me,
or am I missing something? Doug?) Really, the two ways of representing data
should give equivalent analyses, but it doesn't. The same phenomenon
appears in 'glm', i.e. without random effects, but only for the residual
sum of squares, df, and AIC.

The following is a small function calling lme4 twice, first as
before, and then with data 'wrapped up' into Bernoulli data. I also run
'glmmML' in the latter case (since glmmML only allows the Bernoulli
representation):

-------------------------------------------------------------------
sim <- function(N = 10, grpSize = 10, std = 1){
require(glmmML)
require(lme4)

set.seed(1)
z <- rnorm(N, mean = 0, sd = std)
#    pz <- inv.logit(z); is identical to (in 'stats')
pz <- plogis(z)
Y <- rbinom(N, grpSize, pz)

## 'Summary' data frame:

DF <- data.frame(z = z, pz = pz,
y = Y / grpSize,
n = grpSize,
smpl = factor(1:N))

fit1 <- GLMM(y~1, family = binomial, data = DF,
random = ~1|smpl, weights = n)
##fit1 <- glm(y~1, family = binomial, data = DF, weights = n)

## 'Individual' data frame:

n <- N * grpSize
z <- rep(z, rep(grpSize, N))
pz <- rep(pz, rep(grpSize, N))

y <- numeric(n)
for (i in 1:N) y[((i - 1) * grpSize + 1):((i-1)*grpSize + Y[i])] <- 1
smpl <- as.factor(rep(1:N, rep(grpSize, N)))

bigDF <- data.frame(z = z, pz = pz, y = y,
smpl = smpl)
fit2 <- GLMM(y~1, family=binomial, data=bigDF, random=~1|smpl)
##fit2 <- glm(y~1, family=binomial, data=bigDF)

fitML <- glmmML(y ~ 1, family = binomial,
data = bigDF,
cluster = bigDF\$smpl)

return(list(fit1 = fit1,
fit2 = fit2,
fitML = fitML
)
)
}
----------------------------------------------------------------

Output:

\$fit1
Generalized Linear Mixed Model

Family: binomial family with logit link
Fixed: y ~ 1
Data: DF
log-likelihood:  -11.59919
Random effects:
Groups Name        Variance   Std.Dev.
smpl   (Intercept) 3.0384e-10 1.7431e-05

Estimated scale (compare to 1)  1.391217

Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.48955    0.20602  2.3762  0.01749

Number of Observations: 10
Number of Groups: 10

\$fit2
Generalized Linear Mixed Model

Family: binomial family with logit link
Fixed: y ~ 1
Data: bigDF
log-likelihood:  -64.8084
Random effects:
Groups Name        Variance Std.Dev.
smpl   (Intercept) 0.58353  0.7639

Estimated scale (compare to 1)  0.9563351

Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.52811    0.32286  1.6357   0.1019

Number of Observations: 100
Number of Groups: 10

\$fitML

Call:  glmmML(formula = y ~ 1, data = bigDF,
cluster = bigDF\$smpl, family = binomial)

coef se(coef)     z Pr(>|z|)
(Intercept) 0.5587   0.3313 1.686   0.0917

Standard deviation in mixing distribution:  0.764
Std. Error:                                 0.3655

Residual deviance: 129.5 on 98 degrees of freedom       AIC: 133.5
---------------
Note the big difference in estimated random effect 'sd' in the two lme4
runs! Note further how close to each other the corresponding estimates for
the second run of lme4 and of glmmML are.

[...]

Göran
--
Göran Broström                    tel: +46 90 786 5223
Department of Statistics          fax: +46 90 786 6614
Umeå University                   http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se

```