[R] ggplot2: stat_smooth for family=binomial with cbind(Y, N) formula
ONKELINX, Thierry
Thierry.ONKELINX at inbo.be
Wed Dec 18 00:11:29 CET 2013
Dear Michael,
have you tried the fullrange argument of stat_smooth?
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
geom_point() +
geom_smooth(method = "glm", family = binomial, aes(weight = trials), fullrange = TRUE)
Best regrads,
Thierry
________________________________________
Van: Michael Friendly [friendly op yorku.ca]
Verzonden: dinsdag 17 december 2013 19:42
Aan: ONKELINX, Thierry; R-help
Onderwerp: Re: [R] ggplot2: stat_smooth for family=binomial with cbind(Y, N) formula
Thanks very much for this helpful reply, Thierry
Using aes(weight=trials) in stat_smooth() was part of what I was missing
and solves my main
question.
However, for this data, I want to show the extrapolated prediction over
a wider range than in
the data. Adding xlim() doesn't help here-- the plot annotations are
cut off at the lowest
value of Temperature in the data. Is there another way?
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
xlim(30, 81) +
geom_point() +
geom_smooth(method = "glm", family = binomial, aes(weight = trials))
Below is the complete (but messy) code for the plot() I would like to
more or less replicate using
ggplot()
data("SpaceShuttle", package="vcd")
logit2p <- function(logit) 1/(1 + exp(-logit))
plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
xlim = c(30, 81), ylim = c(0,1),
main = "NASA Space Shuttle O-Ring Failures",
ylab = "Estimated failure probability",
xlab = "Temperature (degrees F)",
type="n") # painters model: add points last
fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
data = SpaceShuttle,
family = binomial)
pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
predicted <- data.frame(
Temperature = 30 : 81,
prob = logit2p(pred$fit),
lower = logit2p(pred$fit - 2*pred$se),
upper = logit2p(pred$fit + 2*pred$se)
)
with(predicted, {
polygon(c(Temperature, rev(Temperature)),
c(lower, rev(upper)), col="lightpink", border=NA)
lines(Temperature, prob, lwd=3)
}
)
with(SpaceShuttle,
points(Temperature, nFailures/6, col="blue", pch=19, cex=1.3)
)
I also tried following the example in given in ?geom_smooth, to supply
the predicted
values over my range of x values, and lower/upper limits in a separate
data frame:
pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
predicted <- data.frame(
Temperature = 30 : 81,
prob = logit2p(pred$fit),
lower = logit2p(pred$fit - 2*pred$se),
upper = logit2p(pred$fit + 2*pred$se)
)
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
geom_smooth(aes(ymin = lower, ymax = upper), data=predicted,
stat="identity")
but this gives an error:
> ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
+ geom_smooth(aes(ymin = lower, ymax = upper), data=predicted,
stat="identity")
Error in eval(expr, envir, enclos) : object 'nFailures' not found
>
-Michael
On 12/17/2013 11:26 AM, ONKELINX, Thierry wrote:
> Dear Michael,
>
> Calculate the propotions. Then it is easy to use the weight option of glm
>
> data("SpaceShuttle", package="vcd")
> SpaceShuttle$trials <- 6
>
> fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature, data = SpaceShuttle, family = binomial)
> fm2 <- glm(nFailures/trials ~ Temperature, data = SpaceShuttle, family = binomial, weight = trials)
> all.equal(coef(fm), coef(fm2))
>
> ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) + geom_point() + geom_smooth(method = "glm", family = binomial, aes(weight = trials))
>
> Best regards,
>
> Thierry
>
> -----Oorspronkelijk bericht-----
> Van: r-help-bounces op r-project.org [mailto:r-help-bounces op r-project.org] Namens Michael Friendly
> Verzonden: dinsdag 17 december 2013 14:58
> Aan: R-help
> Onderwerp: [R] ggplot2: stat_smooth for family=binomial with cbind(Y, N) formula
>
> With ggplot2, I can plot the glm stat_smooth for binomial data when the response is binary or a two-level factor as follows:
>
> data("Donner", package="vcdExtra")
> ggplot(Donner, aes(age, survived)) +
> geom_point(position = position_jitter(height = 0.02, width = 0)) + stat_smooth(method = "glm", family = binomial, formula = y ~ x, alpha = 0.2, size=2)
>
> But how can I specify the formula for stat_smooth when the response is cbind(successes, failures)?
> The equivalent with plot (minus the confidence band) for the example I want is:
>
> data("SpaceShuttle", package="vcd")
>
> > head(SpaceShuttle, 5)
> FlightNumber Temperature Pressure Fail nFailures Damage
> 1 1 66 50 no 0 0
> 2 2 70 50 yes 1 4
> 3 3 69 50 no 0 0
> 4 4 80 50 <NA> NA NA
> 5 5 68 50 no 0 0
> >
>
> plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
> xlim = c(30, 81), ylim = c(0,1),
> main = "NASA Space Shuttle O-Ring Failures",
> ylab = "Estimated failure probability",
> xlab = "Temperature (degrees F)",
> pch = 19, col = "blue", cex=1.2)
> fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
> data = SpaceShuttle,
> family = binomial)
> pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
> lines(30 : 81,
> predict(fm, data.frame(Temperature = 30 : 81), type = "response"),
> lwd = 3)
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
> 4700 Keele Street Web: http://www.datavis.ca
> Toronto, ONT M3J 1P3 CANADA
>
> ______________________________________________
>
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
More information about the R-help
mailing list