[R] Why mean is not working in by?
Dimitri Liakhovitski
dimitri.liakhovitski at gmail.com
Wed Dec 9 00:18:54 CET 2015
Got it - thank you, everybody!
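by() splits the data into one data frame per group and passes each whole data frame to FUN.
Lesson: use aggregate() when FUN should be applied to each column separately.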
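
For the archive, a minimal sketch of the difference, assuming the built-in iris data and the myvars vector from the quoted messages. The use of colMeans() and the sapply() wrapper inside by() is my own illustration of "give by() a function that accepts a data frame"; nobody in the thread proposed it.

```r
# Columns and grouping taken from the thread; iris ships with base R.
myvars <- c("Sepal.Length", "Sepal.Width")

# aggregate() hands FUN one column at a time, so mean() receives a
# plain numeric vector for each variable within each species.
means_agg <- aggregate(iris[myvars], by = iris["Species"], FUN = mean)

# by() hands FUN the whole per-group data frame, so FUN must accept a
# data frame: colMeans() does, and sapply() can loop over the columns
# for functions such as sd() that expect a vector (illustrative choice).
means_by <- by(data = iris[myvars], INDICES = iris["Species"], FUN = colMeans)
sds_by   <- by(data = iris[myvars], INDICES = iris["Species"],
               FUN = function(d) sapply(d, sd))

print(means_agg)
print(means_by)
print(sds_by)
```

aggregate() returns a single data frame with one row per species, which is usually easier to work with afterwards than the list-like object that by() returns.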
On Tue, Dec 8, 2015 at 6:17 PM, William Dunlap <wdunlap at tibco.com> wrote:
> by() calls FUN with a data.frame as the argument. summary(), sum(), etc.
> have methods that work on data.frames but sd() and mean() do not.
>
> aggregate() calls its FUN with each column of a data.frame as the argument.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Dec 8, 2015 at 3:08 PM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>>
>> Sorry, I omitted the first line:
>>
>> myvars <- c("Sepal.Length", "Sepal.Width")
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = summary)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = sum)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = var)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = max)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = min)
>>
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = sd)
>> by(data = iris[myvars], INDICES = iris["Species"], FUN = mean)
>>
>> The first lines are doing what I expected them to do: for each level
>> of the factor "Species" they gave me a summary, a sum, a variance, a
>> max, a min for each of the 2 variables in question (myvars).
>> I expected by to generate the sd and the mean for the 2 variables in
>> question for each level of "Species".
>>
>> On Tue, Dec 8, 2015 at 5:50 PM, Sarah Goslee <sarah.goslee at gmail.com>
>> wrote:
>> > Hi Dimitri,
>> >
>> > I changed this into a reproducible example (we don't know what myvars
>> > is). Assuming length(myvars) > 1, I'm not convinced that your first
>> > five lines "work" either: what do you expect?
>> >
>> > I get:
>> >
>> >> by(data = iris[, -5], INDICES = iris["Species"], FUN = min)
>> > Species: setosa
>> > [1] 0.1
>> > ------------------------------------------------------------------
>> > Species: versicolor
>> > [1] 1
>> > ------------------------------------------------------------------
>> > Species: virginica
>> > [1] 1.4
>> >
>> > But was expecting:
>> >
>> >> aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=min)
>> >      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
>> > 1     setosa          4.3         2.3          1.0         0.1
>> > 2 versicolor          4.9         2.0          3.0         1.0
>> > 3  virginica          4.9         2.2          4.5         1.4
>> >
>> >
>> >
>> > aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=sd)
>> > aggregate(iris[,-5], by=iris[,"Species", drop=FALSE], FUN=mean)
>> >
>> > provide the answers I would expect. If you want clearer advice, you
>> > need to provide an actually reproducible example, and tell us more
>> > about what you expect to get.
>> >
>> > Sarah
>> >
>> >
>> > On Tue, Dec 8, 2015 at 5:30 PM, Dimitri Liakhovitski
>> > <dimitri.liakhovitski at gmail.com> wrote:
>> >> Hello!
>> >> Could you please explain why the first 5 lines work but the last 2
>> >> lines don't?
>> >> Thank you!
>> >>
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = summary)
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = sum)
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = var)
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = max)
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = min)
>> >>
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = sd)
>> >> by(data = iris[myvars], INDICES = iris["Species"], FUN = mean)
>> >>
>> >> --
>> >> Dimitri Liakhovitski
>> >>
>>
>>
>>
>> --
>> Dimitri Liakhovitski
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
--
Dimitri Liakhovitski