# [R] Extrapolating values from a glm fit

Gavin Simpson gavin.simpson at ucl.ac.uk
Thu Jan 27 13:16:55 CET 2011

```On Thu, 2011-01-27 at 00:10 -1000, Ahnate Lim wrote:
> Thank you for the clarification, this makes sense now! dose.p from
> MASS also does the job perfectly, in returning x values for specified
> proportions. I'm curious, if I use the other parameter in predict.glm
> binomial, what does this predict?

This is the predicted value(s) on the scale of the linear predictor, or

Recall that in the GLM the response and the linear predictor are related

g(y) = eta

logit(y) = eta

where eta is the linear predictor: beta_0 + beta_1 * x (in your
example).

beta_0 + beta_1 * x gives us the fitted value but on the untransformed
scale. This is the value given by predict() when type = "link" is used.
To get the predicted value on the response scale, we apply the inverse

y = g(eta)^-1

The inverse of the logit function is the inverse-logit:

logit(eta)^-1 = exp(eta) / (1 + exp(eta))

So in R, we can see the relationship through a few simple commands.
First, the prediction for x = 0.5 on the response scale:

> predict(mylogit, newdata = list(x = 0.5), type = "response")
1
0.8149848

Then we compute the same prediction but on the scale of the link
function:

> (p1 <- predict(mylogit, newdata = list(x = 0.5), type = "link"))
1
1.482732

to which we apply the inverse-logit function giving us the same value we
got earlier for type = "response":

> exp(p1) / (1 + exp(p1))
1
0.8149848

There is a function to do this for us, however, called plogis()

> plogis(p1)
1
0.8149848

One reason why you might want predictions on the scale of the link
function is for computation of confidence intervals using normal theory
(e.g. 2-sigma ~95% confidence intervals). On the response scale these
should be asymmetric and respect the scale of the response (bounding
etc), so you compute them on the link scale and apply the inverse of the
link function to get them on to the response scale.

HTH

G

>
>
>
> On Wed, Jan 26, 2011 at 11:41 PM, Gavin Simpson
> <gavin.simpson at ucl.ac.uk> wrote:
>         On Wed, 2011-01-26 at 19:25 -1000, Ahnate Lim wrote:
>         > Even when I try to predict y values from x, let's say I want
>         to predict y at
>         > x=0. Looking at the graph from the provided syntax, I would
>         expect y to be
>         > about 0.85. Is this right:
>         >
>         > predict(mylogit,newdata=as.data.frame(0),type="response")
>
>
>         Your original problem was the use of `newdata =
>         as.data.frame(0.5)`.
>         There are two problems here: i) if you don't name the input (x
>         = 0.5,
>         say) then you get a data frame with the name(s) "0.5":
>
>         > as.data.frame(0.5)
>
>          0.5
>         1 0.5
>
>         and ii) if you do name it, you still get a data frame with
>         name(s) "0.5"
>
>         > as.data.frame(x = 0.5)
>          0.5
>         1 0.5
>
>         In both cases, predict wants to find a variable with the name
>         `x` in the
>         object supplied to `newdata`. It finds `x` but your `x` in the
>         global
>         workspace, but warns because it knows that `newdata` was a
>         data frame
>         with a single row - so there was a mismatch and you likely
>         mistake.
>
>         In these cases, `data.frame()` is preferred to
>         `as.data.frame()`:
>
>         predict(mylogit, newdata = data.frame(x = 0), type =
>         "response")
>
>         or we can use a list, to save a few characters:
>
>         predict(mylogit, newdata = list(x = 0), type = "response")
>
>         which give:
>
>         > predict(mylogit, newdata = list(x = 0), type = "response")
>               1
>         0.813069
>         > predict(mylogit, newdata = data.frame(x = 0), type =
>         "response")
>               1
>         0.813069
>
>         In summary, use `data.frame()` or `list()` to create the
>         object passed
>         as `newdata` and make sure you give the component containing
>         the new
>         values a *name* that matches the predictor in the model
>         formula.
>
>         HTH
>
>
>         G
>
>         >
>         > # I get:
>         >
>         > Warning message:
>         > 'newdata' had 1 rows but variable(s) found have 34 rows
>         >
>         > # Why do I need 34 rows? So I try:
>         >
>         > v=vector()
>         > v[1:34]=0
>         > predict(mylogit,newdata=as.data.frame(v),type="response")
>         >
>         > # And I get a matrix of 34 values that appear to fit a
>         logistic function,
>         >
>         >
>         >
>         >
>         > On Wed, Jan 26, 2011 at 6:59 PM, David Winsemius
>         <dwinsemius at comcast.net>wrote:
>         >
>         > >
>         > > On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote:
>         > >
>         > >  Dear R-help,
>         > >>
>         > >> I have fitted a glm logistic function to dichotomous
>         forced choices
>         > >> responses varying according to time interval between two
>         stimulus. x
>         > >> values
>         > >> are time separation in miliseconds, and the y values are
>         proportion
>         > >> responses for one of the stimulus. Now I am trying to
>         extrapolate x values
>         > >> for the y value (proportion) at .25, .5, and .75. I have
>         tried several
>         > >> predict parameters, and they don't appear to be working.
>         Is this correct
>         > >> use
>         > >> and understanding of the predict function? It would be
>         nice to know the
>         > >> parameters for the glm best fit, but all I really need
>         are the
>         > >> extrapolated
>         > >> x values for those proportions. Thank you for your help.
>         Here is the code:
>         > >>
>         > >> x <-
>         > >> c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7,
>         -167,
>         > >> -150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1,
>         -33.4, -16.7,
>         > >> 16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3,
>         167,
>         > >> 183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)
>         > >>
>         > >> y <-
>         > >> c(0, 0.333333333333333, 0, 0, 0, 0, 0, 0, 0,
>         0.333333333333333,
>         > >> 0, 0.133333333333333, 0.238095238095238,
>         0.527777777777778,
>         > >> 0.566666666666667,
>         > >> 0.845238095238095, 0.55, 1, 0.888888888888889, 1, 1, 1,
>         1, 1,
>         > >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)
>         > >>
>         > >> weight <-
>         > >> c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16,
>         11, 7,
>         > >> 14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)
>         > >>
>         > >> mylogit <- glm(y~x,weights=weight, family = binomial)
>         > >>
>         > >> # now I try plotting the predicted value, and it looks
>         like a good fit,
>         > >> hopefully I can access what the glm is doing
>         > >>
>         > >> ypred <-
>         predict(mylogit,newdata=as.data.frame(x),type="response")
>         > >> plot(x, ypred,type="l")
>         > >> points(x,y)
>         > >>
>         > >> # so I try to predict the x value when y (proportion) is
>         at .5, but
>         > >> something is wrong..
>         > >>
>         > >
>         > > Right. Predict goes in the other direction ... x predicts
>         y.
>         > >
>         > > Perhaps if you created a function of y w.r.t. x and then
>         inverted it.
>         > >
>         > > ?approxfun  # and see the posting earlier this week
>         "Inverse Prediction
>         > > with splines" where it was demonstrated how to reverse the
>         roles of x and y.
>         > >
>         > >>
>         > >> predict(mylogit,newdata=as.data.frame(0.5))
>         > >>
>         > >>        [[alternative HTML version deleted]]
>         > >>
>         > >> ______________________________________________
>         > >> R-help at r-project.org mailing list
>         > >> https://stat.ethz.ch/mailman/listinfo/r-help
>         > >> http://www.R-project.org/posting-guide.html
>         > >> and provide commented, minimal, self-contained,
>         reproducible code.
>         > >>
>         > >
>         > > David Winsemius, MD
>         > > West Hartford, CT
>         > >
>         > >
>         >
>         >       [[alternative HTML version deleted]]
>         >
>         > ______________________________________________
>         > R-help at r-project.org mailing list
>         > https://stat.ethz.ch/mailman/listinfo/r-help
>         http://www.R-project.org/posting-guide.html
>         > and provide commented, minimal, self-contained, reproducible
>         code.
>
>
>         --
>         %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
>         %~%~%~%
>          Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>          ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>          Pearson Building,             [e]
>         gavin.simpsonATNOSPAMucl.ac.uk
>          Gower Street, London          [w]
>         http://www.ucl.ac.uk/~ucfagls/
>          UK. WC1E 6BT.                 [w]
>         http://www.freshwaters.org.uk
>         %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
>         %~%~%~%
>
>
>

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

```