[R] Re : Re : descriptive statistics

Mon Dec 13 20:29:25 CET 2010

You could also use aggstat() from the package tdisplay (available at
http://forums.cirad.fr/logiciel-R/viewtopic.php?t=3367). See its help
page.

> mydata <- data.frame(
+     y1 = c(NA, rnorm(n = 8, mean = 10, sd = 5), NA),
+     y2 = c(rep(NA, 2), rnorm(n = 6, mean = 10, sd = 5), rep(NA, 2)),
+     y3 = rnorm(n = 10, mean = 10, sd = 5),
+     y4 = rnorm(n = 10, mean = 10, sd = 5),
+     f1 = rep(c("a", NA, "b"), times = c(3, 1, 6)),
+     f2 = rep(c("c", "d", NA), times = c(5, 3, 2)),
+     f3 = rep(c("e", "f", "g"), times = c(3, 3, 4))
+     )
> mydata
           y1        y2        y3        y4   f1   f2 f3
1          NA        NA 11.277582 13.120160    a    c  e
2  -0.7843488        NA 18.633881  9.095533    a    c  e
3  11.6555526 15.563409  9.433654 16.062916    a    c  e
4  12.2523768  5.567119 19.381132 13.734706 <NA>    c  f
5  11.4456084  8.170626  5.039419  7.135086    b    c  f
6  16.1444098  2.518970  7.468279  5.441936    b    d  f
7   9.4774380  5.114297 14.777489  8.884707    b    d  g
8  13.9189684 13.090211 17.060803 12.467241    b    d  g
9  12.0196222        NA  4.551620  9.506194    b <NA>  g
10         NA        NA  8.377446  6.572499    b <NA>  g

> aggstat(formula = cbind(y1, y2, y3) ~ f1 + f2, data = mydata, FUN = mean)

aggstat(formula = cbind(y1, y2, y3) ~ f1 + f2, data = mydata,
    FUN = mean)

  f1 f2        y1        y2        y3
1  a  c  5.435602 15.563409 13.115039
2  b  c 11.445608  8.170626  5.039419
3  b  d 13.180272  6.907826 13.102191
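
If tdisplay is not at hand, something close can be obtained with aggregate()
from base R. This is only a minimal sketch on made-up toy data (the names toy
and res are mine, not from the package); it is not how aggstat() is
implemented. By default the formula method of aggregate() uses
na.action = na.omit, which drops every row containing an NA in any variable.
Passing na.action = na.pass together with na.rm = TRUE lets each response
handle its own missing values, and rows with an NA in the grouping factor are
still left out of the result.

```r
# Toy data (hypothetical): two responses with NAs and one factor with an NA.
toy <- data.frame(
    y1 = c(NA, 2, 4, 6, 8),
    y2 = c(1, NA, 3, 5, 7),
    f1 = c("a", "a", "a", "b", NA)
)

# na.pass keeps incomplete rows; na.rm = TRUE is forwarded to mean(),
# so missing values are removed separately for each response.
res <- aggregate(cbind(y1, y2) ~ f1, data = toy, FUN = mean,
                 na.action = na.pass, na.rm = TRUE)
res
```

Replace mean with any other summary function that accepts na.rm.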

See also the function univar():

> mydata <- data.frame(
+     f1 = c(NA, rep("a", 2), rep("b", 5), NA, "a", "a"),
+     f2 = rep(c("c", "d"), times = c(5, 6)),
+     f3 = rep(c("e", NA, "f"), times = c(4, 1, 6)),
+     y1 = c(rnorm(n = 9, mean = 10, sd = 5), NA, 2.1)
+     )
> mydata
     f1 f2   f3        y1
1  <NA>  c    e  6.948897
2     a  c    e 20.115954
3     a  c    e 13.569935
4     b  c    e 12.159732
5     b  c <NA> 11.862606
6     b  d    f 21.610803
7     b  d    f 10.820413
8     b  d    f 13.200561
9  <NA>  d    f  9.694245
10    a  d    f        NA
11    a  d    f  2.100000

> univar(formula = y1 ~ f1, data = mydata)

  f1 NA's n   min    q25 median   mean    q75    max    sd   iqr  range    cv
1  a    1 3  2.10  7.835  13.57 11.929 16.843 20.116 9.119 9.008 18.016 0.764
2  b    0 5 10.82 11.863  12.16 13.931 13.201 21.611 4.376 1.338 10.790 0.314

> univar(formula = y1 ~ f1 + f2, data = mydata)

  f1 f2 NA's n    min    q25 median   mean    q75    max    sd   iqr  range    cv
1  a  c    0 2 13.570 15.206 16.843 16.843 18.479 20.116 4.629 3.273  6.546 0.275
3  a  d    1 1  2.100  2.100  2.100  2.100  2.100  2.100    NA 0.000  0.000    NA
2  b  c    0 2 11.863 11.937 12.011 12.011 12.085 12.160 0.210 0.149  0.297 0.017
4  b  d    0 3 10.820 12.010 13.201 15.211 17.406 21.611 5.669 5.395 10.790 0.373
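
The columns above can be approximated in base R as well. The sketch below is
only an illustration under my own assumptions: describe() is a hypothetical
helper, not part of tdisplay, and the toy data are made up. It takes iqr as
q75 - q25, range as max - min and cv as sd / mean, which is consistent with
the figures printed above, and it relies on the default quantile() method.

```r
# Hypothetical helper: descriptive statistics for one numeric vector.
describe <- function(x) {
    n_missing <- sum(is.na(x))
    x <- x[!is.na(x)]                       # statistics use non-missing values
    q <- quantile(x, probs = c(0.25, 0.50, 0.75), names = FALSE)
    c(n_missing = n_missing, n = length(x),
      min = min(x), q25 = q[1], median = q[2], mean = mean(x),
      q75 = q[3], max = max(x), sd = sd(x),
      iqr = q[3] - q[1], range = max(x) - min(x), cv = sd(x) / mean(x))
}

# Toy data (hypothetical): the row with a missing group is ignored by tapply().
toy <- data.frame(
    g = c("a", "a", "a", "b", "b", "b", "b", NA),
    y = c(1, 3, NA, 2, 4, 6, 8, 5)
)

# One row of statistics per level of g.
stats <- do.call(rbind, tapply(toy$y, toy$g, describe))
round(stats, 3)
```

For several grouping factors, interaction(f1, f2, drop = TRUE) can be used as
the grouping variable in the same way.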

-- 
Matthieu Lesnoff
CIRAD
Bamako, Mali

On 13 December 2010 17:04, Ivan Calandra <ivan.calandra at uni-hamburg.de> wrote:
> Do it with aggregate(), something like this should do:
> aggregate(.~cluster, FUN=summary, data=data)
>
> Now if you don't want to run summary(), replace it with the function you'd
> like.
>
> HTH,
> Ivan
>
> On 12/13/2010 17:17, effeesse wrote:
>>
>> What am I supposed to put into function(x)? The indicator for extracting
>> the subgroups?
>> data is the df. cluster={1,...,14}.
>>
>> This is how I was compiling:
>>
>> "for (i in 1:14) {
>> my.summary<-data$cluster==i c(mean(?),var(?))
>>
>> summary(var_A~cluster, fun=my.summary,data=data)
>> summary(var_B~cluster, fun=my.summary,data=data)
>> summary(var_C~cluster, fun=my.summary,data=data)
>> summary(var_D~cluster, fun=my.summary,data=data)
>> summary(var_E~cluster, fun=my.summary,data=data)
>> summary(var_F~cluster, fun=my.summary,data=data)
>> summary(var_G~cluster, fun=my.summary,data=data)
>> }"
>>
>> thanks for your patience.
>
> --
> Ivan CALANDRA
> PhD Student
> University of Hamburg
> Biozentrum Grindel und Zoologisches Museum
> Abt. Säugetiere
> Martin-Luther-King-Platz 3
> D-20146 Hamburg, GERMANY
> +49(0)40 42838 6231
> ivan.calandra at uni-hamburg.de
>
> **********
> http://www.for771.uni-bonn.de
> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php