[R] ggplot2: stat_smooth for family=binomial with cbind(Y, N) formula

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Wed Dec 18 00:11:29 CET 2013


Dear Michael,

have you tried the fullrange argument of stat_smooth?

ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
     geom_point() +
     geom_smooth(method = "glm", family = binomial, aes(weight = trials), fullrange = TRUE)

Best regrads,

Thierry
________________________________________
Van: Michael Friendly [friendly op yorku.ca]
Verzonden: dinsdag 17 december 2013 19:42
Aan: ONKELINX, Thierry; R-help
Onderwerp: Re: [R] ggplot2: stat_smooth for family=binomial with cbind(Y,       N) formula

Thanks very much for this helpful reply, Thierry

Using aes(weight=trials) in stat_smooth() was part of what I was missing
and solves my main
question.
However, for this data, I want to show the extrapolated prediction over
a wider range than in
the data.  Adding xlim() doesn't help here-- the plot annotations are
cut off at the lowest
value of Temperature in the data. Is there another way?

ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
     xlim(30, 81) +
     geom_point() +
     geom_smooth(method = "glm", family = binomial, aes(weight = trials))

Below is the complete (but messy) code for the plot() I would like to
more or less replicate using
ggplot()

data("SpaceShuttle", package="vcd")
logit2p <- function(logit) 1/(1 + exp(-logit))

plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
      xlim = c(30, 81), ylim = c(0,1),
      main = "NASA Space Shuttle O-Ring Failures",
      ylab = "Estimated failure probability",
      xlab = "Temperature (degrees F)",
      type="n")  # painters model: add points last

fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
           data = SpaceShuttle,
           family = binomial)
pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)

predicted <- data.frame(
     Temperature = 30 : 81,
     prob = logit2p(pred$fit),
     lower = logit2p(pred$fit - 2*pred$se),
     upper = logit2p(pred$fit + 2*pred$se)
     )

with(predicted, {
     polygon(c(Temperature, rev(Temperature)),
             c(lower, rev(upper)), col="lightpink", border=NA)
     lines(Temperature, prob, lwd=3)
     }
     )
with(SpaceShuttle,
     points(Temperature, nFailures/6, col="blue", pch=19, cex=1.3)
     )

I also tried following the example in given in ?geom_smooth, to supply
the predicted
values over my range of x values, and lower/upper limits in a separate
data frame:

pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
predicted <- data.frame(
     Temperature = 30 : 81,
     prob = logit2p(pred$fit),
     lower = logit2p(pred$fit - 2*pred$se),
     upper = logit2p(pred$fit + 2*pred$se)
     )
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
   geom_smooth(aes(ymin = lower, ymax = upper), data=predicted,
stat="identity")

but this gives an error:

 > ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
+   geom_smooth(aes(ymin = lower, ymax = upper), data=predicted,
stat="identity")
Error in eval(expr, envir, enclos) : object 'nFailures' not found
 >

-Michael


On 12/17/2013 11:26 AM, ONKELINX, Thierry wrote:
> Dear Michael,
>
> Calculate the propotions. Then it is easy to use the weight option of glm
>
> data("SpaceShuttle", package="vcd")
> SpaceShuttle$trials <- 6
>
> fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature, data = SpaceShuttle, family = binomial)
> fm2 <- glm(nFailures/trials ~ Temperature, data = SpaceShuttle, family = binomial, weight = trials)
> all.equal(coef(fm), coef(fm2))
>
> ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) + geom_point() + geom_smooth(method = "glm", family = binomial, aes(weight = trials))
>
> Best regards,
>
> Thierry
>
> -----Oorspronkelijk bericht-----
> Van: r-help-bounces op r-project.org [mailto:r-help-bounces op r-project.org] Namens Michael Friendly
> Verzonden: dinsdag 17 december 2013 14:58
> Aan: R-help
> Onderwerp: [R] ggplot2: stat_smooth for family=binomial with cbind(Y, N) formula
>
> With ggplot2, I can plot the glm stat_smooth for binomial data when the response is binary or a two-level factor as follows:
>
> data("Donner", package="vcdExtra")
> ggplot(Donner, aes(age, survived)) +
> geom_point(position = position_jitter(height = 0.02, width = 0)) + stat_smooth(method = "glm", family = binomial, formula = y ~ x, alpha = 0.2, size=2)
>
> But how can I specify the formula for stat_smooth when the response is cbind(successes, failures)?
> The equivalent with plot (minus the confidence band) for the example I want is:
>
> data("SpaceShuttle", package="vcd")
>
>   > head(SpaceShuttle, 5)
>     FlightNumber Temperature Pressure Fail nFailures Damage
> 1            1          66       50   no         0      0
> 2            2          70       50  yes         1      4
> 3            3          69       50   no         0      0
> 4            4          80       50 <NA>        NA     NA
> 5            5          68       50   no         0      0
>   >
>
> plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
>        xlim = c(30, 81), ylim = c(0,1),
>        main = "NASA Space Shuttle O-Ring Failures",
>        ylab = "Estimated failure probability",
>        xlab = "Temperature (degrees F)",
>        pch = 19, col = "blue", cex=1.2)
> fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
>             data = SpaceShuttle,
>             family = binomial)
> pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
> lines(30 : 81,
>         predict(fm, data.frame(Temperature = 30 : 81), type = "response"),
>         lwd = 3)
>
> --
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
>
> ______________________________________________
>

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.



More information about the R-help mailing list