[R-sig-ME] residuals in logistic regression model
Martin Hecht
martin.hecht at hu-berlin.de
Fri Sep 11 22:41:35 CEST 2015
Hi,
I have some questions concerning the residuals in a logistic regression
model ( family = binomial(link = "logit") ).
Basically I am trying to reproduce the fixed error variance of 3.29 from
simulated data (see code below).
My questions are:
[1] What is the difference between the residuals obtained with resid()
vs. m$residuals in glm() ?
[2] Why doesn't any of these two residual variances equal the original
value of 3.29?
[3] How can the residuals be transformed to be on the "original" scale
(with their variance being 3.29)?
[4] The fitted() values seem to be probabilities, is that right?
[5] Why are the fitted() values and the residuals not on the same
scale/metric?
thank you very much,
Martin
# install.packages("gtools")
library(gtools) # for inv.logit
# install.packages("lme4")
library(lme4)
set.seed(1234)
### data generation
# standard logistic errors
eps <- rlogis ( n <- 100000 , 0 , 1 )
# variance of errors is 3.29
var ( eps )
# [1] 3.293174
# regression without predictors
# latent y equals errors
ylat <- 0 + eps
# probabilities using inverse logit function
probs <- inv.logit ( ylat )
# generation of responses from bernoulli distribution
resp <- sapply ( probs , function ( prob ) rbinom ( 1, 1, prob ) )
### glm model
# logistic regression
m <- glm ( resp ~ 1 , family = binomial(link = "logit") )
# variance of resid()
var ( resid ( m ) )
# [1] 1.386302
# variance of residuals in the results object
var ( m$residuals )
# [1] 4.000061
### glmer model
# random groups to specify a "level 2" random effect
# (just for the purpose to be able to run glmer)
gr <- sample ( 1:10 , n , replace = TRUE )
d <- data.frame ( resp , gr )
# model
m2 <- glmer ( resp ~ 1 + (1|gr) , d , family = binomial(link = "logit") )
# variance of resid()
var ( resid ( m2 ) )
# [1] 1.386302
More information about the R-sig-mixed-models
mailing list