[R-sig-ME] residuals in logistic regression model

Tue Sep 15 17:21:57 CEST 2015

Dear Martin,

You need to do some reading on residuals in the context of a
generalised linear model. There are not the same as those of a linear
model. Have a look at the formula of a glm. There is no residual terms
like there is one with a linear model.

Note that there are different ways to calculate residuals on a glm.
See ?residuals.glm

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey

2015-09-11 22:41 GMT+02:00 Martin Hecht <martin.hecht op hu-berlin.de>:
>
> Hi,
>
> I have some questions concerning the residuals in a logistic regression
> model ( family = binomial(link = "logit") ).
> Basically I am trying to reproduce the fixed error variance of 3.29 from
> simulated data (see code below).
> My questions are:
>
> [1] What is the difference between the residuals obtained with resid() vs.
> m$residuals in glm() ?
> [2] Why doesn't any of these two residual variances equal the original value
> of 3.29?
> [3] How can the residuals be transformed to be on the "original" scale (with
> their variance being 3.29)?
> [4] The fitted() values seem to be probabilities, is that right?
> [5] Why are the fitted() values and the residuals not on the same
> scale/metric?
>
> thank you very much,
> Martin
>
>
> # install.packages("gtools")
> library(gtools) # for inv.logit
> # install.packages("lme4")
> library(lme4)
>
> set.seed(1234)
>
> ### data generation
>
> # standard logistic errors
> eps <- rlogis ( n <- 100000 , 0 , 1 )
>
> # variance of errors is 3.29
> var ( eps )
> # [1] 3.293174
>
> # regression without predictors
> # latent y equals errors
> ylat <- 0 + eps
>
> # probabilities using inverse logit function
> probs <- inv.logit ( ylat )
>
> # generation of responses from bernoulli distribution
> resp <- sapply ( probs , function ( prob ) rbinom ( 1, 1, prob ) )
>
>
> ### glm model
>
> # logistic regression
> m <- glm ( resp ~ 1 , family = binomial(link = "logit") )
>
> # variance of resid()
> var ( resid ( m ) )
> # [1] 1.386302
>
> # variance of residuals in the results object
> var ( m$residuals )
> # [1] 4.000061
>
>
> ### glmer model
>
> # random groups to specify a "level 2" random effect
> # (just for the purpose to be able to run glmer)
> gr <- sample ( 1:10 , n , replace = TRUE )
> d <- data.frame ( resp , gr )
>
> # model
> m2 <- glmer ( resp ~ 1 + (1|gr) , d , family = binomial(link = "logit") )
>
> # variance of resid()
> var ( resid ( m2 ) )
> # [1] 1.386302
>
> _______________________________________________
> R-sig-mixed-models op r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models