[R] plotting average effects.
Dennis Murphy
djmuser at gmail.com
Fri Oct 21 19:21:07 CEST 2011
Hi:
Your approach to computing the means is not efficient; a better way
would be to use the aggregate() function. I would start by combining
the grouping variable and the three prediction variables into a data
frame. To get the groupwise mean for all three prediction variables,
you can use a formula interface for aggregate() if you have R-2.11.0
or later, cbinding the three prediction variables into a matrix on the
LHS of the model formula, the grouping variable on the RHS, followed
by the data frame name and the summary function. See ?aggregate for
details; in particular, study the examples with a formula interface.
Save the result to an object. Since this is homework, the details are
left to you.
As far as the base graphics plot goes, I suggest the following:
- use arrows() to produce the lines with arrows
- plot the means by group as points with the points() function.
The arrows() function can take vector arguments; read its help page carefully.
The ggplot2 version of the plot I think you're trying to generate is
given below:
library('ggplot2')
ggplot(pmeans) +
geom_point(aes(x = grp, y = pred1), colour = 'red') +
geom_segment(aes(x = grp, xend = grp, y = pred3, yend = pred2),
arrow = arrow(length = unit(0.4, 'cm')), colour =
'red', size = 1)
pmeans is the name I gave for the averaged predictions by group, with
grp representing the grouping variable and pred1-pred3 per your
definitions.
In addition to the aggregate() and apply family functions, the
packages doBy, plyr and data.table are well designed for groupwise
data summarization and processing.
HTH,
Dennis
On Fri, Oct 21, 2011 at 8:51 AM, gradstudent <nmford at uwm.edu> wrote:
> i will include the data to read if if you so choose.
>
> dat <- read.dta("http://quantoid.net/hw1_2011.dta")
>
> model in question:
>
> mod99 <- glm(democracy ~ popc100kpc + ngrevpc, data=dat, family=binomial)
> ------------------
>
> looking for average effects code, with error on mod99. popckpc is coded in
> 1k per capita.
>
>> dat3$popc100kpc <- dat$popc100kpc - 100
>> dat3$popc100kpc[which(dat3$popc100kpc < min(dat$popc100kpc))] <-
>> min(dat$popc100kpc)
>> dat2 <- dat3 <- dat
>> dat2$popc100kpc <- dat2$popc100kpc + 100
>> dat2$popc100kpc[which(dat2$popc100kpc > max(dat$popc100kpc))] <-
>> max(dat$popc100kpc)
>> dat3$popc100kpc <- dat$popc100kpc - 100
>> dat3$popc100kpc[which(dat3$popc100kpc < min(dat$popc100kpc))] <-
>> min(dat$popc100kpc)
>> pred1 <- predict(mod99, type="response")
>> pred2 <- predict(mod99, newdata=dat2, type="response")
>> pred3 <- predict(mod99, newdata=dat3, type="response")
> ----
> breaking the variable into groups:
>
>> pop1.group <- cut(dat$popc100kpc, breaks=quantile(dat$popc100kpc,
>> seq(0,1,by=.25)), include.lowest=T)
>> apply, 2, mean)
>
>> means <- by(cbind(pred1, pred2, pred3), list(pop1.group),
> + apply, 2, mean)
>> means <- do.call(rbind, means)
>
>
> ----
> and finally attempting to plot:
>
>> par(mar=c(7,4,4,2))
>> plot(c(1,10), range(c(means)), type="n", xlab="",
> + ylab="Predicted Probability", axes=F)
>> arrows(1:10, means[,1], 1:10, means[,2], code=2, length=.1)
>> arrows(1:10, means[,1], 1:10, means[,3], code=2, length=.1, col="red")
>> points(1:10, means[,1], pch=16)
> Error in xy.coords(x, y) : 'x' and 'y' lengths differ
>> axis(1, at=1:10, labels=rownames(means), las=2)
> Error in axis(1, at = 1:10, labels = rownames(means), las = 2) :
> 'at' and 'labels' lengths differ, 10 != 4
> --------
>
>
> I am not sure how to fix the error. Thank you for your time.
>
>
>
>
> -----
> Ph.D. Candidate
> --
> View this message in context: http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925945.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list