Thanks very much for this helpful reply, Thierry
Using aes(weight=trials) in stat_smooth() was part of what I was missing
and solves my main
question.
However, for this data, I want to show the extrapolated prediction over
a wider range than in
the data. Adding xlim() doesn't help here-- the plot annotations are
cut off at the lowest
value of Temperature in the data. Is there another way?
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
xlim(30, 81) +
geom_point() +
geom_smooth(method = "glm", family = binomial, aes(weight = trials))
Below is the complete (but messy) code for the plot() I would like to
more or less replicate using
ggplot()
data("SpaceShuttle", package="vcd")
logit2p <- function(logit) 1/(1 + exp(-logit))
plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
xlim = c(30, 81), ylim = c(0,1),
main = "NASA Space Shuttle O-Ring Failures",
ylab = "Estimated failure probability",
xlab = "Temperature (degrees F)",
type="n") # painters model: add points last
fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
data = SpaceShuttle,
family = binomial)
pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
predicted <- data.frame(
Temperature = 30 : 81,
prob = logit2p(pred$fit),
lower = logit2p(pred$fit - 2*pred$se),
upper = logit2p(pred$fit + 2*pred$se)
)
with(predicted, {
polygon(c(Temperature, rev(Temperature)),
c(lower, rev(upper)), col="lightpink", border=NA)
lines(Temperature, prob, lwd=3)
}
)
with(SpaceShuttle,
points(Temperature, nFailures/6, col="blue", pch=19, cex=1.3)
)
I also tried following the example in given in ?geom_smooth, to supply
the predicted
values over my range of x values, and lower/upper limits in a separate
data frame:
pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
predicted <- data.frame(
Temperature = 30 : 81,
prob = logit2p(pred$fit),
lower = logit2p(pred$fit - 2*pred$se),
upper = logit2p(pred$fit + 2*pred$se)
)
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
geom_smooth(aes(ymin = lower, ymax = upper), data=predicted,
stat="identity")
but this gives an error:
> ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) +
+ geom_smooth(aes(ymin = lower, ymax = upper), data=predicted,
stat="identity")
Error in eval(expr, envir, enclos) : object 'nFailures' not found
>
-Michael
On 12/17/2013 11:26 AM, ONKELINX, Thierry wrote:
Dear Michael,
Calculate the propotions. Then it is easy to use the weight option of glm
data("SpaceShuttle", package="vcd")
SpaceShuttle$trials <- 6
fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature, data = SpaceShuttle,
family = binomial)
fm2 <- glm(nFailures/trials ~ Temperature, data = SpaceShuttle, family =
binomial, weight = trials)
all.equal(coef(fm), coef(fm2))
ggplot(SpaceShuttle, aes(x = Temperature, y = nFailures / trials)) + geom_point() +
geom_smooth(method = "glm", family = binomial, aes(weight = trials))
Best regards,
Thierry
-----Oorspronkelijk bericht-----
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens
Michael Friendly
Verzonden: dinsdag 17 december 2013 14:58
Aan: R-help
Onderwerp: [R] ggplot2: stat_smooth for family=binomial with cbind(Y, N) formula
With ggplot2, I can plot the glm stat_smooth for binomial data when the
response is binary or a two-level factor as follows:
data("Donner", package="vcdExtra")
ggplot(Donner, aes(age, survived)) +
geom_point(position = position_jitter(height = 0.02, width = 0)) + stat_smooth(method =
"glm", family = binomial, formula = y ~ x, alpha = 0.2, size=2)
But how can I specify the formula for stat_smooth when the response is
cbind(successes, failures)?
The equivalent with plot (minus the confidence band) for the example I want is:
data("SpaceShuttle", package="vcd")
> head(SpaceShuttle, 5)
FlightNumber Temperature Pressure Fail nFailures Damage
1 1 66 50 no 0 0
2 2 70 50 yes 1 4
3 3 69 50 no 0 0
4 4 80 50 <NA> NA NA
5 5 68 50 no 0 0
>
plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
xlim = c(30, 81), ylim = c(0,1),
main = "NASA Space Shuttle O-Ring Failures",
ylab = "Estimated failure probability",
xlab = "Temperature (degrees F)",
pch = 19, col = "blue", cex=1.2)
fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
data = SpaceShuttle,
family = binomial)
pred <- predict(fm, data.frame(Temperature = 30 : 81), se=TRUE)
lines(30 : 81,
predict(fm, data.frame(Temperature = 30 : 81), type = "response"),
lwd = 3)
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
______________________________________________
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.