[R] Question about computing offsets automatically
Marc Schwartz
MSchwartz at medanalytics.com
Thu Nov 6 21:31:49 CET 2003
On Thu, 2003-11-06 at 13:33, Louisell, Paul T. wrote:
> Hi,
>
> I'm using R version 1.8.0 on Windows NT. When fitting a glm with Poisson
> random component and a log link, I frequently need to include an offset.
> Typically I use xtabs or table to get the counts for the contingency table,
> and then I use as.data.frame.table to create a data frame that I can use in
> the glm function. I have not found an option that allows me to total the
> offset variable to obtain offsets for cells in the contingency table.
>
> For example, suppose I have the following data frame named Data:
>
>   F1 F2 Off
> 1  A  C   4
> 2  A  C   3
> 3  A  C   2
> 4  B  C   3
> 5  A  D   2
> 6  A  D   4
> 7  B  D   1
>
> xtabs(~F1+F2, data=Data) produces the contingency table:
>
>    F2
> F1  C D
>   A 3 2
>   B 1 1
>
> And as.data.frame.table(xtabs(~F1+F2, data=Data)) changes the contingency
> table to a data frame suitable for use in the glm function:
>
>   F1 F2 Freq
> 1  A  C    3
> 2  B  C    1
> 3  A  D    2
> 4  B  D    1
>
> What I'm looking for is some option that would add a 4th column to the
> output of as.data.frame.table which contains the offsets for each cell in
> the contingency table:
>
>   F1 F2 Freq Off
> 1  A  C    3   9
> 2  B  C    1   3
> 3  A  D    2   6
> 4  B  D    1   1
>
> Does such an option exist somewhere in R (I wasn't able to find it in the
> documentation for the table, xtabs, as.data.frame.table, or glm functions)?
> I can obtain the Off column easily enough in a simple loop, but I thought
> there might be an option for this somewhere.
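For anyone wanting to reproduce the example, here is a minimal sketch that rebuilds Data from the listing in the question. It assumes F1 and F2 are factors and Off is numeric (the post does not show how Data was created), and the name counts is only illustrative:

```r
# Rebuild the example data frame shown in the question.
# Assumption: F1 and F2 are factors; the original post does not say.
Data <- data.frame(
  F1  = factor(c("A", "A", "A", "B", "A", "A", "B")),
  F2  = factor(c("C", "C", "C", "C", "D", "D", "D")),
  Off = c(4, 3, 2, 3, 2, 4, 1)
)

# Cell counts in long format, one row per F1 x F2 combination.
# as.data.frame() dispatches to as.data.frame.table() for xtabs objects.
counts <- as.data.frame(xtabs(~ F1 + F2, data = Data))
print(counts)
```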
I don't know of an easy 'option' approach, but you can use aggregate()
to get the sums and then do a cbind() to add the fourth column:
> aggregate(Data$Off, list(F1 = Data$F1, F2 = Data$F2), sum)
  F1 F2 x
1  A  C 9
2  B  C 3
3  A  D 6
4  B  D 1
So:
> df <- as.data.frame.table(xtabs(~F1+F2, data = Data))
> df
  F1 F2 Freq
1  A  C    3
2  B  C    1
3  A  D    2
4  B  D    1
> Off <- aggregate(Data$Off, list(F1 = Data$F1, F2 = Data$F2), sum)$x
> Off
[1] 9 3 6 1
> cbind(df, Off)
  F1 F2 Freq Off
1  A  C    3   9
2  B  C    1   3
3  A  D    2   6
4  B  D    1   1
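One caveat with the cbind() step: aggregate() returns only the factor combinations that occur in the data, while xtabs() keeps empty cells with a count of 0, so the two results can differ in length if some cell is empty. A possible alternative, sketched here under assumptions rather than offered as the definitive approach, is to put Off on the left-hand side of the xtabs() formula so that it is summed per cell, and then merge on the factor columns. The names counts, offs, cells and fit are illustrative, Data is rebuilt from the question, and the glm() call assumes Off is an exposure total that enters the model on the log scale:

```r
# Example data from the question (F1 and F2 assumed to be factors).
Data <- data.frame(
  F1  = factor(c("A", "A", "A", "B", "A", "A", "B")),
  F2  = factor(c("C", "C", "C", "C", "D", "D", "D")),
  Off = c(4, 3, 2, 3, 2, 4, 1)
)

# Counts per cell (column Freq).
counts <- as.data.frame(xtabs(~ F1 + F2, data = Data))

# Summed offsets per cell: a left-hand side in the formula is summed
# within each cell. responseName renames the result column to "Off".
offs <- as.data.frame(xtabs(Off ~ F1 + F2, data = Data),
                      responseName = "Off")

# Match rows by factor levels instead of relying on row order.
# Note that merge() sorts by the key columns, so the row order can
# differ from that of as.data.frame.table().
cells <- merge(counts, offs, by = c("F1", "F2"))
print(cells)

# Assumed model form: Poisson, log link, log(Off) as the offset.
# Cells with Off == 0 would have to be dropped first, since log(0) is -Inf.
fit <- glm(Freq ~ F1 + F2 + offset(log(Off)),
           family = poisson, data = cells)
```

Merging by F1 and F2 keeps each offset attached to its own cell even when the two tables are ordered differently.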
HTH,
Marc Schwartz