[R] alternative to logistic regression

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Nov 16 16:28:27 CET 2007


On Fri, 16 Nov 2007, Terry Therneau wrote:

> You can fit a linear probability model with glm and a bit of arm twisting.
> First, make your own copy of the binomial function:
>   > dump('binomial', file='mybinom.R')
>
> Edit it to change the function name to "mybinom" (or anything else you
> like), and to add 'identity' to the list of okLinks.

Hmm ... I think you are generalizing from another R-like system.

binomial("identity") works out of the box, and R's glm() will 
backtrack if it encounters a fitted value < 0 or > 1.  Now, the 
back-tracking can get stuck but it often does a reasonable job.

Examples:

set.seed(1)
x <- seq(0, 1, length=10)
y <- rbinom(10, 10, 0.1 + 0.8*x)
glm(cbind(y, 10-y)  ~ x, binomial("identity"), start=c(0.5,0))

works.  But variants, e.g.

y <- rbinom(10, 10, 0.1 + 0.9*x)

backtrack and give warnings.

What does not work is binomial(identity), unlike binomial(logit).

>  Source the file back in, and use mybiom('identity') to fit the model.
>
>  Caveat Emptor: This will work just fine if all of your data is far enough away
> from 0 or 1.  But if the predicted value for any data point, at any time during
> the iteration, is <=0 or >=1 then the calculation of the log-likelihood will
> involve an NA and the glm routine will fail.  NAs produced deep inside a
> computation tend to produce unhelpful and/or misleading error messages
> (sometimes the NA can propogate for some ways through the code before creating a
> failure).  You can also get the counterintuitive result that models with few or
> poor covariates work (all the predictions are near to the mean), but more useful
> covariates cause the model to fail.
>
>  Linear links for both the binomial and Poisson are a challenging computational
> problem.  But they are becoming important in medical work due to recent
> appreciation that the absolute risk attached to a variable is often more
> relevant than the relative risk (odds ratio or risk ratio).
>
>  	Terry Therneau
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list