[R] basic table statistics
David Winsemius
dwinsemius at comcast.net
Sat Apr 24 23:25:14 CEST 2010
On Apr 23, 2010, at 3:48 PM, Maxim wrote:
> I have a very simple question, but I'm obviously not able to solve the
> problem on my own.
>
> I have a data.frame like
>
> sample(c("A","B","C"),size=20,replace = T)->type
>
> rnorm(20)->value
>
> data.frame(ty=type,val=value)->test
>
> There must be some built in functions, that will do some descriptive
> statistics with tabular output, in the end I like to have something
> like
>
> number of samples mean sd .............
>
> A 5
> B 9
> C 6
>
> So I need a function that counts the number of occurrences of
> factors in
> type and then does something like the *summary* function, but factor
> specific.
>
> I tried:
> vector()->Median
> vector()->SD
> vector()->Mean
>
> as.data.frame(table(type))->int
> for (count in c(1:(nrow(int))))
> {
> subset(test, ty==as.character(int$type[count])) -> subtest
> median(subtest$val)->Median[count]
> sd(subtest$val)->SD[count]
> mean(subtest$val)->Mean[count]
> }
>
>
> cbind(int,Median,SD,Mean)
> require(Design)  # loads Hmisc, which has one of many versions of describe()
> describe(test)
test
2 Variables 20 Observations
-------------------------------------------------------------------------
ty
      n missing  unique
     20       0       3
A (4, 20%), B (5, 25%), C (11, 55%)
-------------------------------------------------------------------------
val
      n missing  unique     Mean       .05       .10       .25
     20       0      20  0.07383 -0.865776 -0.815317 -0.707465
      .50      .75      .90      .95
 0.005735 0.634226 1.270066 1.771820
lowest : -1.7965 -0.8168 -0.8152 -0.8040 -0.7170
highest: 0.6790 1.0680 1.2149 1.7665 1.8729
-------------------------------------------------------------------------
> require(doBy)
> summaryBy(value~ty, test, FUN=list(length, mean, min, max, sd, median))
  ty value.length  value.mean  value.min value.max  value.sd
1  A            4 -0.03442822 -0.8151531  1.766502 1.2258221
2  B            5  0.34541927 -0.8167919  1.214906 0.7647165
3  C           11 -0.01025352 -1.7964684  1.872865 1.0109676
  value.median
1  -0.54453098
2   0.57020532
3  -0.06826249
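
For comparison, a table of this kind can probably also be built in base R with aggregate(). This is only a sketch: the seed and the object names stats and res are my own choices for illustration, and the numbers will not match the output above because the random draws differ.

```r
# Recreate data shaped like the poster's (the seed is an arbitrary choice).
set.seed(1)
test <- data.frame(ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
                   val = rnorm(20))

# One row per level of ty; FUN returns a named vector of statistics.
stats <- aggregate(val ~ ty, data = test,
                   FUN = function(x) c(n = length(x), mean = mean(x),
                                       sd = sd(x), median = median(x)))

# aggregate() stores the statistics as a single matrix column;
# do.call(data.frame, ...) flattens it into val.n, val.mean, val.sd, val.median.
res <- do.call(data.frame, stats)
res
```

The flattening step is only needed if you want ordinary columns to work with afterwards; for printing, stats is already readable.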
The by() function, which is a wrapper around tapply() for data frames, can also be used.
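
A minimal sketch of the by() approach, assuming data shaped like the poster's test data frame. The seed and the names per_group and tab are hypothetical, chosen here for illustration only.

```r
# Recreate data shaped like the poster's (the seed is an arbitrary choice).
set.seed(1)
test <- data.frame(ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
                   val = rnorm(20))

# summary() of val within each level of ty.
by(test$val, test$ty, summary)

# Custom statistics: by() passes each sub-data-frame to the function.
per_group <- by(test, test$ty, function(d)
  c(n = nrow(d), mean = mean(d$val), sd = sd(d$val), median = median(d$val)))

# Bind the per-group vectors into one matrix, one row per level of ty.
tab <- do.call(rbind, per_group)
tab

# tapply() gives a single statistic per group directly.
tapply(test$val, test$ty, mean)
```

The rbind step is what turns the list-like by() result into the tabular layout asked for in the question.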
>
>
>
> This works, but: isn't this much too complicated, I bet there is such
> functionality embedded in the base packages, but I cannot find it.
>
>
> Maxim
David Winsemius, MD
West Hartford, CT