[R] How to separate a data set by its factors
James Rome
jamesrome at gmail.com
Fri Dec 25 15:38:18 CET 2009
Thanks for the help.
I tried making the pdf file as suggested. Acrobat said it was damaged
and could not be opened. Is this an R bug?
It did make a PostScript file that I was able to distill into PDF, but
it came out in grayscale. How do I get the color back?
And yes, I did do the layout I wanted so I could see how the days
compared for each hour.
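For the archive, a minimal sketch of the PDF and PostScript routes, under a few assumptions: the output paths are hypothetical, and the plot is simply the iris example quoted below. A PDF that Acrobat calls damaged is often one whose device was never closed with dev.off(), although I cannot say that is what happened here. As far as I know, lattice's trellis.device() defaults to a black-and-white theme for postscript devices and to color otherwise, which would account for the grayscale output; passing color = TRUE asks for the color theme.

```r
library(lattice)

# Hypothetical output locations; substitute your own paths.
pdf_file <- file.path(tempdir(), "test.pdf")
ps_file  <- file.path(tempdir(), "test.ps")

p <- xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
            data = iris, scales = "free", layout = c(2, 1, 2),
            auto.key = list(x = .6, y = .7, corner = c(0, 0)))

# PDF device: lattice uses its color theme here by default.
pdf(pdf_file)
print(p)    # explicit print() is required inside scripts, loops, and functions
dev.off()   # the file is not complete until the device is closed

# PostScript device: request the color theme explicitly.
trellis.device(device = "postscript", color = TRUE, file = ps_file)
print(p)
dev.off()
```

Storing the plot in an object and calling print() on it also makes it easy to send the same figure to several devices.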
On 12/24/09 4:56 PM, David Winsemius wrote:
>
>
> pdf("test.pdf")
> xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width |
> Species, data = iris, scales = "free", layout = c(2, 1, 2), auto.key =
> list(x = .6, y = .7, corner = c(0, 0)))
> dev.off()
> You may not be getting what you expect, but it may be that your plots
> are all being created, but too quickly to be seen. Try printing to a
> more durable "canvas".
>
>> And I would like to add a Poisson Distribution fit to each of these
>> plots (see below), but am clueless as to how to go about it.
>>
>> I would like to fit a distribution to the count data for each
>> combination of day and hour, and I am unable to see how to do this in a
>> vector manner. For example, I tried
>> density((Arrival.Val | DAY*Hour), na.rm=TRUE)
>> which does not work.
>
> I should think that this would be informative:
>
> glm(Arrival.Val ~ DAY*Hour, family="poisson")
>
> Since DAY and Hour are factors you will get a large number of
> estimates. You can use the typical regression functions, such as
> predict() and summary() to get the fitted values.
>
I tried glm:
---------
> glm(Arrival.Val ~ DAY*as.factor(Hour), family="poisson")
Call: glm(formula = Arrival.Val ~ DAY * as.factor(Hour), family = "poisson")
Coefficients:
(Intercept)                               3.15396
DAY[T.Monday]                            -0.61348
DAY[T.Saturday]                          -0.43853
DAY[T.Sunday]                            -0.93475
DAY[T.Thursday]                          -0.23109
DAY[T.Tuesday]                           -0.38137
DAY[T.Wednesday]                         -0.35715
as.factor(Hour)[T.1]                     -1.01389
as.factor(Hour)[T.2]                     -1.07451
as.factor(Hour)[T.3]                     -0.69315
as.factor(Hour)[T.4]                     -0.87384
as.factor(Hour)[T.5]                     -0.57808
as.factor(Hour)[T.6]                     -0.41122
as.factor(Hour)[T.7]                      0.26453
as.factor(Hour)[T.8]                     -0.08802
as.factor(Hour)[T.9]                     -0.01618
as.factor(Hour)[T.10]                     0.33495
as.factor(Hour)[T.11]                     0.40389
as.factor(Hour)[T.12]                     0.43834
as.factor(Hour)[T.13]                     0.49019
as.factor(Hour)[T.14]                     0.56895
as.factor(Hour)[T.15]                     0.54856
as.factor(Hour)[T.16]                     0.50895
as.factor(Hour)[T.17]                     0.49770
as.factor(Hour)[T.18]                     0.49879
as.factor(Hour)[T.19]                     0.41296
as.factor(Hour)[T.20]                     0.37310
as.factor(Hour)[T.21]                     0.26455
as.factor(Hour)[T.22]                     0.14955
as.factor(Hour)[T.23]                     0.07016
DAY[T.Monday]:as.factor(Hour)[T.1]        1.02978
DAY[T.Saturday]:as.factor(Hour)[T.1]      0.81973
DAY[T.Sunday]:as.factor(Hour)[T.1]        0.58645
DAY[T.Thursday]:as.factor(Hour)[T.1]      0.17046
DAY[T.Tuesday]:as.factor(Hour)[T.1]       0.66905
DAY[T.Wednesday]:as.factor(Hour)[T.1]     0.63300
DAY[T.Monday]:as.factor(Hour)[T.2]        0.61348
DAY[T.Saturday]:as.factor(Hour)[T.2]           NA
. . . .
DAY[T.Tuesday]:as.factor(Hour)[T.22]      0.37518
DAY[T.Wednesday]:as.factor(Hour)[T.22]    0.34362
DAY[T.Monday]:as.factor(Hour)[T.23]       0.52431
DAY[T.Saturday]:as.factor(Hour)[T.23]     0.04906
DAY[T.Sunday]:as.factor(Hour)[T.23]       0.68802
DAY[T.Thursday]:as.factor(Hour)[T.23]     0.39860
DAY[T.Tuesday]:as.factor(Hour)[T.23]      0.43209
DAY[T.Wednesday]:as.factor(Hour)[T.23]    0.49274
Degrees of Freedom: 8124 Total (i.e. Null); 7963 Residual
(18 observations deleted due to missingness)
Null Deviance: 40120
Residual Deviance: 17030 AIC: 59170
----------------
I am not sure what to make of this.
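One way to read output like this, sketched on simulated data because the real arrival table is not part of this message: with the default log link the coefficients are on the log scale, so exp() of the intercept is the fitted mean count for the reference cell (apparently Friday at hour 0 here, since those levels have no coefficient of their own; exp(3.15396) is about 23.4). Because DAY * Hour gives every day/hour cell its own parameter, the fitted Poisson mean for each cell should simply equal that cell's sample mean. The data frame `d`, the grid `cells`, and the simulated rates are all assumptions made for illustration.

```r
set.seed(1)

# Simulated stand-in for the arrival data: 10 observations per day/hour cell.
days <- c("Friday", "Monday", "Saturday", "Sunday",
          "Thursday", "Tuesday", "Wednesday")
d <- expand.grid(DAY = factor(days, levels = days), Hour = 0:23, rep = 1:10)
d$Arrival.Val <- rpois(nrow(d), lambda = 5 + d$Hour / 2)

fit <- glm(Arrival.Val ~ DAY * as.factor(Hour), family = "poisson", data = d)

# One fitted Poisson mean (lambda) per day/hour combination.
cells <- expand.grid(DAY = factor(days, levels = days), Hour = 0:23)
cells$lambda <- predict(fit, newdata = cells, type = "response")

# The same numbers obtained directly as cell means.
cell_means <- with(d, tapply(Arrival.Val, list(DAY, Hour), mean))

# exp(intercept) is the fitted mean for the reference cell (Friday, hour 0).
ref_mean <- exp(coef(fit)[["(Intercept)"]])

head(cells)
```

If only the per-cell Poisson means are wanted, tapply() or aggregate() gives them without fitting a model; the glm becomes more useful once the interaction is dropped or standard errors are needed.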
So how do I get the fit plotted on top of my histograms?
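A possible approach with lattice, again on simulated data and only as a sketch: give histogram() a custom panel function that first draws the bars and then adds the Poisson probabilities evaluated at that panel's own mean. The data frame `d`, the two-day layout, and the output file name are hypothetical. Using unit-width bins centered on the integers with type = "density" puts the bar heights on the same scale as dpois().

```r
library(lattice)

set.seed(3)
# Simulated stand-in for the arrival data (two days only, to keep it short).
d <- data.frame(
  DAY = factor(rep(c("Monday", "Tuesday"), each = 150)),
  Arrival.Val = c(rpois(150, lambda = 6), rpois(150, lambda = 12))
)

# Unit-width bins centered on the integers: bar height = observed proportion.
brks <- seq(-0.5, max(d$Arrival.Val) + 0.5, by = 1)

p <- histogram(~ Arrival.Val | DAY, data = d, type = "density", breaks = brks,
               panel = function(x, ...) {
                 panel.histogram(x, ...)
                 if (length(x) > 0) {
                   k <- 0:max(x)
                   pk <- dpois(k, lambda = mean(x))  # Poisson fit for this panel
                   panel.lines(k, pk, col = "black")
                   panel.points(k, pk, pch = 16, col = "black")
                 }
               })

out_file <- file.path(tempdir(), "poisson_overlay.pdf")  # hypothetical path
pdf(out_file)
print(p)
dev.off()
```

The same panel function should work unchanged with a formula such as ~ Arrival.Val | DAY * Hour, since each panel only sees its own subset of x.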
Is there a way to save the bin data from the histogram command?
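On saving the bin data: base R's hist() returns an object holding the breaks, counts, densities, and midpoints, and plot = FALSE suppresses the drawing so that only the object is returned. A small sketch, assuming a simulated vector `x` standing in for the counts of a single day/hour cell:

```r
set.seed(2)
x <- rpois(200, lambda = 8)   # simulated stand-in for one day/hour cell

# Unit-width bins centered on the integers 0, 1, 2, ...
brks <- seq(-0.5, max(x) + 0.5, by = 1)

h <- hist(x, breaks = brks, plot = FALSE)   # compute the bins, draw nothing
bins <- data.frame(mid = h$mids, count = h$counts, density = h$density)

# Poisson probabilities at the sample mean, on the same integer grid.
k <- 0:max(x)
bins$poisson <- dpois(k, lambda = mean(x))

head(bins)
```

The saved object can be drawn later with plot(h), and the `poisson` column added on top with points() or lines().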
>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
Again, thanks for the prompt holiday response.
Jim Rome