[R] Why mean is not working in by?

William Dunlap wdunlap at tibco.com
Wed Dec 9 00:17:57 CET 2015


by() calls FUN once per group, passing that group's rows as a data.frame.
summary(), sum(), var(), max(), and min() all accept a data.frame, but sd()
and mean() do not.
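
As a rough illustration of the point about by(), the sketch below looks at
what FUN actually receives and how a few functions react to it. It assumes
a current R version (3.0.0 or later) and the built-in iris data; the names
setosa, classes, pooled_sum, mean_result and sd_result are made up for this
example.

```r
# Uses the built-in iris data and the two columns from the original question.
myvars <- c("Sepal.Length", "Sepal.Width")
setosa <- iris[iris$Species == "setosa", myvars]

# by() hands FUN a data.frame, not a vector, for each group.
classes <- by(iris[myvars], iris["Species"], FUN = class)
print(classes)

# sum() accepts a data.frame but pools both columns into a single number.
pooled_sum <- sum(setosa)
print(pooled_sum)

# mean() has no data.frame method here: it warns and returns NA.
mean_result <- suppressWarnings(mean(setosa))
print(mean_result)

# sd() fails outright on a data.frame; capture the message instead of stopping.
sd_result <- tryCatch(sd(setosa), error = function(e) conditionMessage(e))
print(sd_result)
```

Note that sum(), max() and min() run without complaint but return one value
per group rather than one per column, which may not be what was intended.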

aggregate() calls its FUN on each column of each group's data.frame
separately, so FUN receives a plain vector.
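
A minimal sketch of two ways to get per-column means and standard
deviations by group, assuming the built-in iris data and the myvars columns
from the original question. The anonymous wrapper functions and the names
agg_means, by_means and by_sds are illustrative, not part of the thread.

```r
myvars <- c("Sepal.Length", "Sepal.Width")

# aggregate() applies FUN to each column within each group.
agg_means <- aggregate(iris[myvars], by = iris["Species"], FUN = mean)
print(agg_means)

# To stay with by(), supply a FUN that itself works column by column.
by_means <- by(iris[myvars], iris["Species"], FUN = function(d) sapply(d, mean))
by_sds <- by(iris[myvars], iris["Species"], FUN = function(d) sapply(d, sd))
print(by_means)
print(by_sds)
```

aggregate() returns a data.frame with one row per group, while by() returns
a list-like "by" object with one element per group; which is more
convenient depends on what happens to the result next.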

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Dec 8, 2015 at 3:08 PM, Dimitri Liakhovitski <
dimitri.liakhovitski at gmail.com> wrote:

> Sorry, I omitted the first line:
>
> myvars <- c("Sepal.Length", "Sepal.Width")
> by(data = iris[myvars], INDICES = iris["Species"], FUN = summary)
> by(data = iris[myvars], INDICES = iris["Species"], FUN = sum)
> by(data = iris[myvars], INDICES = iris["Species"], FUN = var)
> by(data = iris[myvars], INDICES = iris["Species"], FUN = max)
> by(data = iris[myvars], INDICES = iris["Species"], FUN = min)
>
> by(data = iris[myvars], INDICES = iris["Species"], FUN = sd)
> by(data = iris[myvars], INDICES = iris["Species"], FUN = mean)
>
> The first lines are doing what I expected them to do: for each level
> of the factor "Species" they gave me a summary, a sum, a variance, a
> max, a min for each of the 2 variables in question (myvars).
> I expected by to generate the sd and the mean for the 2 variables in
> question for each level of "Species".
>
> On Tue, Dec 8, 2015 at 5:50 PM, Sarah Goslee <sarah.goslee at gmail.com>
> wrote:
> > Hi Dimitri,
> >
> > I changed this into a reproducible example (we don't know what myvars
> > is). Assuming length(myvars) > 1, I'm not convinced that your first
> > five lines "work" either: what do you expect?
> >
> > I get:
> >
> >> by(data = iris[, -5], INDICES = iris["Species"], FUN = min)
> > Species: setosa
> > [1] 0.1
> > ------------------------------------------------------------------
> > Species: versicolor
> > [1] 1
> > ------------------------------------------------------------------
> > Species: virginica
> > [1] 1.4
> >
> > But was expecting:
> >
> >> aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=min)
> >      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
> > 1     setosa          4.3         2.3          1.0         0.1
> > 2 versicolor          4.9         2.0          3.0         1.0
> > 3  virginica          4.9         2.2          4.5         1.4
> >
> >
> >
> > aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=sd)
> > aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=mean)
> >
> > provide the answers I would expect. If you want clearer advice, you
> > need to provide an actually reproducible example, and tell us more
> > about what you expect to get.
> >
> > Sarah
> >
> >
> > On Tue, Dec 8, 2015 at 5:30 PM, Dimitri Liakhovitski
> > <dimitri.liakhovitski at gmail.com> wrote:
> >> Hello!
> >> Could you please explain why the first 5 lines work but the last 2
> >> lines don't?
> >> Thank you!
> >>
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = summary)
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = sum)
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = var)
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = max)
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = min)
> >>
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = sd)
> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = mean)
> >>
> >> --
> >> Dimitri Liakhovitski
> >>
>
>
>
> --
> Dimitri Liakhovitski
>
