[R] Summary stats in table
Duncan Murdoch
murdoch.duncan at gmail.com
Sun Oct 30 20:09:01 CET 2011
On 11-10-24 7:16 PM, Hadley Wickham wrote:
> On Mon, Oct 24, 2011 at 5:39 AM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> Suppose I have data like this:
>>
>> A<- sample(letters[1:3], 1000, replace=TRUE)
>> B<- sample(LETTERS[1:2], 1000, replace=TRUE)
>> x<- rnorm(1000)
>>
>> I can get a table of means via
>>
>> tapply(x, list(A, B), mean)
>>
>> and I can add the marginal means to this using cbind/rbind:
>>
>> main<- tapply(x, list(A,B), mean)
>> Amargin<- tapply(x, list(A), mean)
>> Bmargin<- tapply(x, list(B), mean)
>>
>> rbind(cbind(main, all=Amargin),all=c(Bmargin, mean(x)))
>>
>> But this is tedious. Has some package got some code that makes this easier?
>
> Have a look at reshape2::add_margins - it's not super efficient, but I
> think cool because it works for arbitrarily many dimensions.
>
> Hadley
>
I actually was hoping for something more like what I remember vaguely
from SAS PROC TABULATE, which I haven't used in about 20 years. Anyway,
I decided to go ahead and write it; it's fun enough that I'll probably
put it on CRAN eventually. Here's what it currently gives:
> tabular( x*mean*(A+1) ~ (B+1) )
                 B
                 A        B        All
 x mean A a      0.05058  0.02308  0.036279
          b      0.02878  0.01188  0.020953
          c      0.06869 -0.08192 -0.003033
        All      0.04906 -0.01511  0.017875
and here's a more elaborate example:
> example(tabular)
tabulr> tabular( (Species + 1) ~ (n=1) + Format(digits=2)*
tabulr+          (Sepal.Length + Sepal.Width)*(mean + sd), data=iris )

                         Sepal.Length Sepal.Width
                      n  mean  sd     mean  sd
 Species setosa      50  5.01  0.35   3.43  0.38
         versicolor  50  5.94  0.52   2.77  0.31
         virginica   50  6.59  0.64   2.97  0.32
 All                150  5.84  0.83   3.06  0.44
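
For comparison, the manual tapply/cbind/rbind approach from the quoted message can be wrapped in a small base-R helper. This is only a minimal sketch of that approach, not how tabular() is implemented; the function name means_with_margins and its FUN argument are made up for illustration, and the seed is arbitrary.

```r
# Hypothetical helper (not part of any package): a two-way table of a
# summary statistic with "All" margins, built from tapply() as in the
# quoted message.
means_with_margins <- function(x, A, B, FUN = mean) {
  main    <- tapply(x, list(A, B), FUN)  # one cell per (A, B) combination
  Amargin <- tapply(x, list(A), FUN)     # one value per level of A
  Bmargin <- tapply(x, list(B), FUN)     # one value per level of B
  # c() drops the 1-d array dimension but keeps the level names.
  rbind(cbind(main, All = c(Amargin)),
        All = c(Bmargin, FUN(x)))        # last entry: statistic over all of x
}

set.seed(1)  # arbitrary seed, only so the run is reproducible
A <- sample(letters[1:3], 1000, replace = TRUE)
B <- sample(LETTERS[1:2], 1000, replace = TRUE)
x <- rnorm(1000)

result <- means_with_margins(x, A, B)
print(result)
```

The margins are recomputed from the raw data rather than averaged from the cell means, so they stay correct when the cells have unequal counts.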
Duncan Murdoch