[Rd] R-devel Digest, Vol 181, Issue 22
Randall Pruim
rpruim at calvin.edu
Sun Mar 25 15:56:22 CEST 2018
Thanks.
I am fully aware of what aggregate() returns, and I can post-process it into the form I want, provided the names are available.
But for foo, the returned object both differs in structure and loses the name altogether:
foo <- function(x) { c(mean = base::mean(x)) }
str(aggregate(iris$Sepal.Length, by = list(iris$Species), FUN = foo))
## 'data.frame': 3 obs. of 2 variables:
## $ Group.1: Factor w/ 3 levels "setosa","versicolor",..: 1 2 3
## $ x : num 5.01 5.94 6.59
If $x were $mean, or if $x were a matrix with one column named “mean”, then I would not have had to write a custom aggregate.
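To make the desired behavior concrete, here is a minimal base-R sketch of a group-wise summary that keeps the names returned by FUN even when FUN returns a single value. The helper name group_stats and its arguments are hypothetical, made up for illustration; this is not the modified aggregate() mentioned in this message, only one way the names could be preserved.

```r
# Hypothetical helper (not part of base R): apply FUN to each group and
# row-bind the results, so that the names of FUN's return value become
# column names regardless of the length of that value.
group_stats <- function(x, g, FUN, ...) {
  pieces <- lapply(split(x, g), FUN, ...)  # one named vector per group
  stats <- do.call(rbind, pieces)          # matrix; rows are groups
  data.frame(group = rownames(stats), stats,
             row.names = NULL, check.names = FALSE)
}

foo <- function(x) { c(mean = base::mean(x)) }
res <- group_stats(iris$Sepal.Length, iris$Species, foo)
names(res)
## [1] "group" "mean"
```

Because rbind() takes its column names from the names of the vectors it binds, the single statistic ends up in a column called mean rather than x.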
For anyone interested, I’m doing this to create a function that can easily compute multiple group-wise summary statistics and return them in a tidy data frame with one row per group. Naming depends on whether the ... arguments are named and on some optional arguments. Here is an example:
args(df_stats)
## function (formula, data, ..., drop = TRUE, fargs = list(), sep = "_",
## format = c("wide", "long"), groups = NULL, long_names = TRUE,
## nice_names = FALSE, na.action = "na.warn")
df_stats(Sepal.Length ~ Species, data = iris, mean, sd, R = range, Q = quantile)
##      Species mean_Sepal.Length sd_Sepal.Length R_1 R_2 Q_0% Q_25% Q_50% Q_75% Q_100%
## 1     setosa             5.006       0.3524897 4.3 5.8  4.3 4.800   5.0   5.2    5.8
## 2 versicolor             5.936       0.5161711 4.9 7.0  4.9 5.600   5.9   6.3    7.0
## 3  virginica             6.588       0.6358796 4.9 7.9  4.9 6.225   6.5   6.9    7.9
As I’ve said, I solved my problem by creating a slightly modified version of aggregate(). But it made me wonder whether this is a bug or a feature in aggregate().
—rjp
On Mar 25, 2018, at 6:00 AM, r-devel-request at r-project.org wrote:
Date: Sat, 24 Mar 2018 20:08:33 +0000 (UTC)
From: lmo <lukemolson at yahoo.com>
To: "R-devel at r-project.org" <R-devel at r-project.org>
Subject: [Rd] aggregate() naming -- bug or feature
Message-ID: <1099191693.397761.1521922113450 at mail.yahoo.com>
Content-Type: text/plain; charset="utf-8"
Be aware that the object that aggregate returns with bar() is more complicated than you think.
str(aggregate(iris$Sepal.Length, by = list(iris$Species), FUN = bar))
'data.frame': 3 obs. of 2 variables:
$ Group.1: Factor w/ 3 levels "setosa","versicolor",..: 1 2 3
$ x : num [1:3, 1:2] 5.006 5.936 6.588 0.352 0.516 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "mean" "sd"
So you get a two column data.frame whose second column is a matrix.
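A minimal sketch of how that matrix column can be inspected and flattened into ordinary columns. The definition of bar is not shown in this message, so the one below is an assumption chosen to match the "mean" and "sd" dimnames in the str() output above.

```r
# Assumed definition of bar (not shown in this message); it returns a
# named vector of length two, matching the dimnames printed above.
bar <- function(x) { c(mean = base::mean(x), sd = stats::sd(x)) }

agg <- aggregate(iris$Sepal.Length, by = list(iris$Species), FUN = bar)
is.matrix(agg$x)   # the second column is a 3 x 2 matrix
colnames(agg$x)    # the names returned by bar survive as column names

# Flatten: bind the grouping column to the matrix converted to a data frame.
flat <- cbind(agg["Group.1"], as.data.frame(agg$x))
names(flat)
## [1] "Group.1" "mean"    "sd"
```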