[R] Summary using by() returns character arrays in a list
PIKAL Petr
petr.pikal at precheza.cz
Wed Oct 10 15:43:12 CEST 2012
Hi
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Alex van der Spek
> Sent: Wednesday, October 10, 2012 2:48 PM
> To: r-help at r-project.org
> Subject: [R] Summary using by() returns character arrays in a list
>
> I use by() to generate summary statistics like so:
>
> Lbys <- by(dat[Nidx], dat$LipTest, summary)
>
> where Nidx is an index vector with names picking out the columns in the
> data frame dat.
>
> This returns a list of character arrays (see below for str() output)
> where the columns are named correctly but the rownames are empty
> strings and the values are strings prepended with the summary
> statistic's name (e.g.
> "Min.", "Median ").
Without knowing your data it is difficult to tell what is wrong.
If I use the iris data set as input, everything works as expected:
> data(iris)
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
 Median :5.800   Median :3.000   Median :4.350   Median :1.300
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
       Species
 setosa    :50
 versicolor:50
 virginica :50
> by(iris, iris$Species, summary)
iris$Species: setosa
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.300   Min.   :1.000   Min.   :0.100
 1st Qu.:4.800   1st Qu.:3.200   1st Qu.:1.400   1st Qu.:0.200
 Median :5.000   Median :3.400   Median :1.500   Median :0.200
 Mean   :5.006   Mean   :3.428   Mean   :1.462   Mean   :0.246
 3rd Qu.:5.200   3rd Qu.:3.675   3rd Qu.:1.575   3rd Qu.:0.300
 Max.   :5.800   Max.   :4.400   Max.   :1.900   Max.   :0.600
       Species
 setosa    :50
 versicolor: 0
 virginica : 0
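Note that the nicely printed output above is itself a 'table' of formatted character strings with empty row names, because summary() on a data frame builds its result for printing. A quick check on iris, offered only as a sketch (the variable name s is just for illustration):

```r
data(iris)

# summary() on a data frame returns a 'table' object holding formatted strings
s <- summary(iris)

print(class(s))         # class of the returned object
print(is.character(s))  # whether the cells are character strings
print(rownames(s))      # the row names of the table
```

So the str() output you see is what summary.data.frame() produces for any data frame, not something caused by your particular data.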
>
> I am reading the code of summary.data.frame() but can't figure out how
> to change that function's behavior so that it returns a list of numeric
> matrices whose rownames are the summary statistics' names ("Min.", "Max.",
> etc.) and whose values are the numeric values of the calculated summary
> statistics.
What exactly do you dislike about this output, and how do you want the result structured?
Maybe you want aggregate(), but without a small example data set it is hard to say.
aggregate(iris[1:2], list(iris$Species), summary)
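If the goal is a list of numeric matrices with the statistic names as row names, one possible approach is to split the data frame by group and compute the statistics column by column. The sketch below uses iris in place of your data; six_num is a hypothetical helper written for this example (it is not part of base R), and num_cols stands in for your Nidx.

```r
data(iris)

# Hypothetical helper: the six usual summary statistics as a named numeric vector.
# Uses quantile() and mean() directly, so nothing is formatted as text.
six_num <- function(x) {
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE, names = FALSE)
  c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
    "Mean" = mean(x, na.rm = TRUE), "3rd Qu." = q[4], "Max." = q[5])
}

# Stand-in for the Nidx column selection in the original question
num_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# One numeric matrix per group: rows are statistics, columns are variables
Lnum <- lapply(split(iris[num_cols], iris$Species),
               function(d) vapply(d, six_num, numeric(6)))

str(Lnum$setosa)
print(Lnum$setosa["Mean", "Sepal.Length"])
```

With your data this would presumably become split(dat[Nidx], dat$LipTest). Because the helper always returns six values, columns containing NA do not change the shape of the result, which can happen when summary() itself is passed to sapply().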
Regards
Petr
>
> Any help much appreciated!
> Regards,
> Alex van der Spek
>
>
> > str(Lbys)
> List of 2
> $ : 'table' chr [1:6, 1:19] "Min. :-0.190 " "1st Qu.: 9.297 "
> "Median :10.373 " "Mean :10.100 " ...
> ..- attr(*, "dimnames")=List of 2
> .. ..$ : chr [1:6] "" "" "" "" ...
> .. ..$ : chr [1:19] "Cell_3_SOS....GVF." "Cell_3_SOSq..ms.ms."
> "Cell_3_Airflow..cfm." "Cell_3_Float..in.." ...
> $ T38: 'table' chr [1:6, 1:19] "Min. :8.648 " "1st Qu.:8.920 "
> "Median :9.018 " "Mean :9.027 " ...
> ..- attr(*, "dimnames")=List of 2
> .. ..$ : chr [1:6] "" "" "" "" ...
> .. ..$ : chr [1:19] "Cell_3_SOS....GVF." "Cell_3_SOSq..ms.ms."
> "Cell_3_Airflow..cfm." "Cell_3_Float..in.." ...
> - attr(*, "dim")= int 2
> - attr(*, "dimnames")=List of 1
> ..$ dat$LipTest: chr [1:2] "" "T38"
> - attr(*, "call")= language by.data.frame(data = dat[Nidx], INDICES =
> dat$LipTest, FUN = summary)
> - attr(*, "class")= chr "by"
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.