[R] How to separate a data set by its factors
David Winsemius
dwinsemius at comcast.net
Thu Dec 24 22:56:42 CET 2009
On Dec 24, 2009, at 3:24 PM, James Rome wrote:
> I have a large data set of airport data and wish to analyze it by hour
> and day of the week. hour and day of the week are factors.
>
> I can do something such as:
> histogram(~() | , type="count", breaks=60)
> which displays the data the way I want it in principle, but the plots
> are too small to read. I added layout=c(7,6,4) to the argument list, but
> then I only get the first page of plots. How do I see the other pages?
I was not aware that layout had a paging argument, but that just shows
you that there are large gaps in my knowledge. If I munge one of the
examples on the xyplot help page I get (ugly) multi-page output:
library(lattice)
pdf("test.pdf")
xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
       data = iris, scales = "free", layout = c(2, 1, 2),
       auto.key = list(x = .6, y = .7, corner = c(0, 0)))
dev.off()
You may not be getting what you expect, but it may be that your plots
are all being created and each page is replaced on screen too quickly
to be seen. Try printing to a more durable "canvas", such as a pdf() file.
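Something along these lines might work for your case. It is only a
sketch: I do not have your data, so the data frame `airport` and its
simulated counts are made up, and I am assuming 7 days by 24 hours,
which is what layout=c(7,6,4) suggests (7 columns, 6 rows, 4 pages).

```r
library(lattice)

# Hypothetical stand-in for the airport data: 7 days x 24 hours
set.seed(1)
airport <- expand.grid(DAY = factor(1:7), Hour = factor(0:23), rep = 1:20)
airport$Arrival.Val <- rpois(nrow(airport), lambda = 5)

# Build the trellis object first; nothing is drawn yet
p <- histogram(~ Arrival.Val | DAY * Hour, data = airport,
               type = "count", layout = c(7, 6, 4))

# Printing to a pdf device writes every page of the layout to one file
out <- file.path(tempdir(), "arrivals.pdf")
pdf(out, width = 11, height = 8.5)
print(p)
dev.off()
```

The resulting file should have one page per block of 42 panels, which
you can then page through at leisure.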
> And I would like to add a Poisson Distribution fit to each of these
> plots (see below), but am clueless as to how to go about it.
>
> I would like to fit a distribution to the count data for each
> combination of day and hour, and I am unable to see how to do this
> in a vector manner. For example, I tried
> density((Arrival.Val | DAY*Hour), na.rm=TRUE)
> which does not work.
I should think that this would be informative:
glm(Arrival.Val ~ DAY*Hour, family="poisson")
Since DAY and Hour are factors you will get a large number of
estimates. You can use the typical regression functions, such as
summary() and predict(), to get the estimates and the fitted values.
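As a rough illustration with simulated data (the data frame `airport`
and the three-day, three-hour design are assumptions, not your data),
the fitted Poisson rate for each DAY/Hour combination could be pulled
out like this:

```r
# Hypothetical stand-in for the airport data
set.seed(42)
airport <- expand.grid(DAY = factor(c("Mon", "Tue", "Wed")),
                       Hour = factor(c(6, 12, 18)), rep = 1:50)
airport$Arrival.Val <- rpois(nrow(airport),
                             lambda = 4 + as.integer(airport$Hour))

# Poisson regression with a separate rate for every DAY/Hour cell
fit <- glm(Arrival.Val ~ DAY * Hour, family = poisson, data = airport)

# One row per combination, with the fitted rate on the count scale
cells <- unique(airport[, c("DAY", "Hour")])
cells$lambda <- as.numeric(predict(fit, newdata = cells, type = "response"))

# For comparison: the observed mean count in each cell
cell.means <- aggregate(Arrival.Val ~ DAY + Hour, data = airport, FUN = mean)
```

Because DAY*Hour includes every interaction, the model has one
parameter per cell, so each fitted rate is just that cell's observed
mean. The glm() route earns its keep once you drop terms or want
standard errors from summary(fit).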
>
> I think my question boils down to "how do you replace a whole data set
> by its factored subsets in all of the usual R commands?"
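One common idiom for that is split() followed by lapply(), or tapply()
for simple summaries. A sketch with made-up data (again, `airport` and
its contents are assumptions on my part):

```r
# Hypothetical stand-in for the airport data
set.seed(7)
airport <- expand.grid(DAY = factor(c("Mon", "Tue")),
                       Hour = factor(c(8, 17)), rep = 1:40)
airport$Arrival.Val <- rpois(nrow(airport), lambda = 6)

# split() returns a list with one vector per DAY/Hour combination
by.cell <- split(airport$Arrival.Val, list(airport$DAY, airport$Hour))

# Any function can then be applied to each subset in turn
dens <- lapply(by.cell, density, na.rm = TRUE)

# tapply() does the split and the apply in one step; the cell mean is
# the maximum-likelihood estimate of a Poisson rate
rates <- tapply(airport$Arrival.Val,
                list(airport$DAY, airport$Hour), mean, na.rm = TRUE)
```

Here `dens` holds one density estimate per combination and `rates` is
a DAY by Hour matrix of means.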
>
> I am climbing up a steep R learning curve, and so would appreciate
> some help.
>
> Thanks,
David Winsemius, MD
Heritage Laboratories
West Hartford, CT