[R] more on apply on data frame
Gabor Grothendieck
ggrothendieck at myway.com
Sat Aug 21 14:32:58 CEST 2004
Laura Holt <lauraholt_983 <at> hotmail.com> writes:
>
> Hi R People:
>
> Several of you pointed out that using "tapply" on a data frame will work on
> the iris data frame.
>
> I'm still having a problem.
>
> The iris data frame has 150 rows, 5 variables. The first 4 are numeric,
> while the last is a factor, which has the Species names.
>
> I can use tapply for 1 variable at a time:
> >tapply(iris[,1],iris[,5],mean)
>     setosa versicolor  virginica
>      5.006      5.936      6.588
> >
> but if I try to use this for all of the first 4, I get an error:
> >tapply(iris[,1:4],iris[,5],mean)
> Error in tapply(iris[, 1:4], iris[, 5], mean) :
> arguments must have same length
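The error comes from how length is defined for a data frame: it counts columns, not rows, so the first argument has length 4 while the grouping factor has length 150. A minimal sketch of that check, using only the built-in iris data (the variable names are just for illustration):

```r
# A data frame is a list of columns, so its length is the column count.
n_values <- length(iris[, 1:4])  # 4 columns, not 150 rows
n_groups <- length(iris[, 5])    # 150 factor entries, one per row

# tapply needs these two lengths to match; here they do not.
lengths_match <- n_values == n_groups
```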
This is a job for aggregate:
R> data(iris)
R> aggregate(iris[,1:4], list(Species = iris[,5]), mean)
     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026
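In more recent versions of R, aggregate also has a formula interface that should give the same table without building the grouping list by hand. This is a sketch that assumes your R version includes that method; species_means is just an illustrative name:

```r
# Formula interface: "." stands for every column other than Species.
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```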
The by function would also work, with colMeans applied to each group:
R> by(iris[,1:4], list(Species = iris[,5]), colMeans)
Species: setosa
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       5.006        3.428        1.462        0.246
------------------------------------------------------------
Species: versicolor
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       5.936        2.770        4.260        1.326
------------------------------------------------------------
Species: virginica
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       6.588        2.974        5.552        2.026
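If a single matrix is more convenient than the printed by output, two further possibilities are sketched here. Neither is needed for the answer above; the names by_means and tapply_means are only illustrative:

```r
# Option 1: stack the per-species vectors returned by by() into a matrix,
# one row per species and one column per numeric variable.
by_means <- do.call(rbind, by(iris[, 1:4], iris[, 5], colMeans))

# Option 2: run the single-column tapply call once per numeric column.
tapply_means <- sapply(iris[, 1:4],
                       function(column) tapply(column, iris[, 5], mean))
```

Both should hold the same 3 x 4 set of means as the aggregate result, just as a numeric matrix rather than a data frame.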