[R-sig-ME] residuals in logistic regression model

Fri Sep 11 22:41:35 CEST 2015

Hi,

I have some questions concerning the residuals in a logistic regression 
model ( family = binomial(link = "logit") ).
Basically, I am trying to reproduce the fixed error variance of 3.29
(pi^2/3) from simulated data (see code below).
My questions are:

[1] What is the difference between the residuals returned by resid(m)
and those stored in m$residuals for a glm() fit?
[2] Why does neither of these two residual variances equal the original
value of 3.29?
[3] How can the residuals be transformed to be on the "original" scale 
(with their variance being 3.29)?
[4] The fitted() values seem to be probabilities. Is that right?
[5] Why are the fitted() values and the residuals not on the same 
scale/metric?

Thank you very much,
Martin
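
The following is a minimal sketch, not part of the original script, of
the latent-variable (threshold) formulation in which a standard logistic
error enters a logistic regression: the response is 1 whenever the
linear predictor plus the error exceeds zero. The intercept eta, the
seed, and all variable names are assumptions chosen for illustration;
plogis() is the base R inverse logit.

```r
set.seed(1)
n   <- 100000
eta <- 0.8                       # hypothetical intercept on the logit scale
eps <- rlogis(n, 0, 1)           # standard logistic errors

# latent-variable formulation: observe 1 when the latent value is positive
y_threshold <- as.integer(eta + eps > 0)

# direct formulation: Bernoulli draws with probability plogis(eta)
y_bernoulli <- rbinom(n, 1, plogis(eta))

# both fits estimate the same intercept on the logit scale
b_threshold <- unname(coef(glm(y_threshold ~ 1, family = binomial(link = "logit"))))
b_bernoulli <- unname(coef(glm(y_bernoulli ~ 1, family = binomial(link = "logit"))))
c(threshold = b_threshold, bernoulli = b_bernoulli)

# empirical and theoretical variance of the latent errors
c(empirical = var(eps), theoretical = pi^2 / 3)
```

Both data sets should give an intercept estimate close to eta. Under
this reading, the logistic error is consumed in the step from the latent
scale to the 0/1 response, so it is a property of the unobserved latent
variable rather than something the residuals of the fitted model would
be expected to reproduce.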

# install.packages("gtools")
library(gtools) # for inv.logit
# install.packages("lme4")
library(lme4)

set.seed(1234)

### data generation

# standard logistic errors
n <- 100000
eps <- rlogis ( n , 0 , 1 )

# variance of errors is 3.29
var ( eps )
# [1] 3.293174

# regression without predictors
# latent y equals errors
ylat <- 0 + eps

# probabilities using inverse logit function
probs <- inv.logit ( ylat )

# generation of responses from bernoulli distribution
resp <- sapply ( probs , function ( prob ) rbinom ( 1, 1, prob ) )

### glm model

# logistic regression
m <- glm ( resp ~ 1 , family = binomial(link = "logit") )

# variance of resid()
var ( resid ( m ) )
# [1] 1.386302

# variance of residuals in the results object
var ( m$residuals )
# [1] 4.000061

### glmer model

# random groups to specify a "level 2" random effect
# (only so that glmer can be run)
gr <- sample ( 1:10 , n , replace = TRUE )
d <- data.frame ( resp , gr )

# model
m2 <- glmer ( resp ~ 1 + (1|gr) , d , family = binomial(link = "logit") )

# variance of resid()
var ( resid ( m2 ) )
# [1] 1.386302
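
The two glm numbers above can be compared against the residual types
that residuals() offers for glm fits. The sketch below is a
self-contained check on freshly simulated intercept-only data (the
sample size, seed, and variable names are assumptions, not taken from
the script above). For a glm object, resid() defaults to
type = "deviance", while the $residuals component holds the working
residuals from the final IWLS iteration.

```r
set.seed(1234)
n   <- 10000
y   <- rbinom(n, 1, 0.5)          # Bernoulli responses, probability 0.5
fit <- glm(y ~ 1, family = binomial(link = "logit"))

r_dev  <- resid(fit)              # default type for glm: deviance residuals
r_work <- fit$residuals           # component of the fit: working residuals

# confirm which type each object corresponds to
same_dev  <- isTRUE(all.equal(r_dev,  resid(fit, type = "deviance")))
same_work <- isTRUE(all.equal(r_work, resid(fit, type = "working")))
c(deviance = same_dev, working = same_work)

# variance of each of the four common residual types
types <- c("deviance", "working", "pearson", "response")
v <- sapply(types, function(type) var(resid(fit, type = type)))
v

# only the response residuals are on the scale of fitted(), i.e. probabilities
range(fitted(fit))
```

With a fitted probability near 0.5, each deviance residual is about
+/- sqrt(2 * log(2)) and each working residual about +/- 2, so their
variances come out near 1.386 and 4, in line with the values reported
above. None of these four types appears to be on the latent logistic
scale, which would be consistent with neither variance being 3.29.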