[R] Bug in by() function which works for some FUN argument and does not work for others
Akhilesh Singh
akhileshsingh.igkv at gmail.com
Fri Apr 15 10:16:54 CEST 2016
Dear All,
Thanks for your help. However, I would like to draw your attention to the
following:
I was replicating Example 2.3, which uses the dataset "brainsize.txt", from
Section 2.3.3 ("Summarize by group") on page 55 of the well-known book
"R by Example" by Jim Albert and Maria Rizzo, published by Springer (2012)
in the Use R! series. The output of the by() function printed in the book
is reproduced below for everyone's information:
> by(data=brain[, -1], INDICES=brain$Gender, FUN=mean, na.rm=TRUE)
brain$Gender: Female
   FSIQ     VIQ     PIQ  Weight Height  MRI_Count
111.900 109.450 110.450 137.200 65.765 862654.600
------------------------------------------------------------
brain$Gender: Male
     FSIQ       VIQ       PIQ    Weight   Height    MRI_Count
115.00000 115.25000 111.60000 166.44444 71.43158 954855.40000
I do not know how the authors of the book produced the above results with
by(). When I could not reproduce them, I first suspected the missing values
(NAs) in the Weight and Height variables. I then tried the same code on the
"mtcars" dataset, which has no missing values, with INDICES=mtcars$am. When
I got the same NA results and warnings there too, I reported the case to
"r-help at R-project.org".
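A possible explanation, stated here as an assumption rather than something confirmed in this thread: older versions of R had a data frame method for mean() that returned column means; as far as I know it was deprecated in R 2.14.0 and removed in R 3.0.0, which would account for the book's output. In current R, by() hands each group to FUN as a data frame, mean() falls through to mean.default(), and the result is NA with a warning. A minimal sketch on mtcars (the three-column subset and the variable names are illustrative only):

```r
# Illustrative subset of a built-in dataset; the column choice is arbitrary.
cars <- mtcars[, c("mpg", "wt", "hp")]

# mean() has no data frame method in R >= 3.0.0, so mean.default() returns NA
# (with the warning "argument is not numeric or logical: returning NA").
whole_frame <- suppressWarnings(mean(cars))

# colMeans() accepts a data frame of numeric columns and averages each column.
col_means <- colMeans(cars)

# by() passes each group's rows to FUN as a data frame, so FUN must handle one.
# Extra arguments such as na.rm = TRUE are forwarded to FUN through '...'.
by_colmeans <- by(data = cars, INDICES = mtcars$am, FUN = colMeans, na.rm = TRUE)

# Alternative with the same result: apply mean() to each column separately.
by_sapply <- by(data = cars, INDICES = mtcars$am,
                FUN = function(d) sapply(d, mean, na.rm = TRUE))

print(by_colmeans)
```

For the book's example the same pattern would presumably read by(brain[, -1], brain$Gender, colMeans, na.rm=TRUE), assuming every remaining column of "brain" is numeric.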
With best regards,
Dr. A.K. Singh
Head, Department of Agril. Statistics
Indira Gandhi Krishi Vishwavidyalaya, Raipur
Chhattisgarh, India, PIN-492012
Mobile: +919752620740
Email: akhileshsingh.igkv at gmail.com
On Fri, Apr 15, 2016 at 3:06 AM, Adrian Dușa <dusa.adrian at unibuc.ro> wrote:
> I think you are not using the best function for what your intentions are.
> Try:
>
> > by(data=mtcars, INDICES=list(as.factor(mtcars$am)), FUN=colMeans)
> : 0
>         mpg         cyl        disp          hp        drat          wt
>  17.1473684   6.9473684 290.3789474 160.2631579   3.2863158   3.7688947
>        qsec          vs          am        gear        carb
>  18.1831579   0.3684211   0.0000000   3.2105263   2.7368421
>
> ---------------------------------------------------------------------------
> : 1
>         mpg         cyl        disp          hp        drat          wt
>  24.3923077   5.0769231 143.5307692 126.8461538   4.0500000   2.4110000
>        qsec          vs          am        gear        carb
>  17.3600000   0.5384615   1.0000000   4.3846154   2.9230769
>
> See the difference between colMeans() and mean() in their respective help
> files.
> Hth,
> Adrian
>
> On Thu, Apr 14, 2016 at 11:14 PM, Akhilesh Singh <
> akhileshsingh.igkv at gmail.com> wrote:
>
>> Dear Sirs,
>>
>> I am Professor at Indira Gandhi Krishi Vishwavidyalaya, Raipur,
>> Chhattisgarh, India.
>>
>> While teaching classes, I found that the by() function produces the
>> following error when I use FUN=mean, FUN=median, or some other functions;
>> however, FUN=summary works.
>>
>> Given below is the output of the example I used on a built-in dataset
>> "mtcars", along with error message reproduced herewith:
>>
>> > by(data=mtcars, INDICES=list(mtcars$am), FUN=mean)
>> : 0
>> [1] NA
>> ------------------------------------------------------------
>> : 1
>> [1] NA
>> Warning messages:
>> 1: In mean.default(data[x, , drop = FALSE], ...) :
>> argument is not numeric or logical: returning NA
>> 2: In mean.default(data[x, , drop = FALSE], ...) :
>> argument is not numeric or logical: returning NA
>>
>> However, the same by() function works for FUN=summary, given below is the
>> output:
>>
>> > by(data=mtcars, INDICES=list(mtcars$am), FUN=summary)
>> : 0
>>      mpg            cyl           disp            hp
>> Min.   :10.40  Min.   :4.000  Min.   :120.1  Min.   : 62.0
>> 1st Qu.:14.95  1st Qu.:6.000  1st Qu.:196.3  1st Qu.:116.5
>> Median :17.30  Median :8.000  Median :275.8  Median :175.0
>> Mean   :17.15  Mean   :6.947  Mean   :290.4  Mean   :160.3
>> 3rd Qu.:19.20  3rd Qu.:8.000  3rd Qu.:360.0  3rd Qu.:192.5
>> Max.   :24.40  Max.   :8.000  Max.   :472.0  Max.   :245.0
>>     drat            wt            qsec             vs           am
>> Min.   :2.760  Min.   :2.465  Min.   :15.41  Min.   :0.0000  Min.   :0
>> 1st Qu.:3.070  1st Qu.:3.438  1st Qu.:17.18  1st Qu.:0.0000  1st Qu.:0
>> Median :3.150  Median :3.520  Median :17.82  Median :0.0000  Median :0
>> Mean   :3.286  Mean   :3.769  Mean   :18.18  Mean   :0.3684  Mean   :0
>> 3rd Qu.:3.695  3rd Qu.:3.842  3rd Qu.:19.17  3rd Qu.:1.0000  3rd Qu.:0
>> Max.   :3.920  Max.   :5.424  Max.   :22.90  Max.   :1.0000  Max.   :0
>>     gear           carb
>> Min.   :3.000  Min.   :1.000
>> 1st Qu.:3.000  1st Qu.:2.000
>> Median :3.000  Median :3.000
>> Mean   :3.211  Mean   :2.737
>> 3rd Qu.:3.000  3rd Qu.:4.000
>> Max.   :4.000  Max.   :4.000
>> ------------------------------------------------------------
>> : 1
>>      mpg            cyl           disp            hp            drat
>> Min.   :15.00  Min.   :4.000  Min.   : 71.1  Min.   : 52.0  Min.   :3.54
>> 1st Qu.:21.00  1st Qu.:4.000  1st Qu.: 79.0  1st Qu.: 66.0  1st Qu.:3.85
>> Median :22.80  Median :4.000  Median :120.3  Median :109.0  Median :4.08
>> Mean   :24.39  Mean   :5.077  Mean   :143.5  Mean   :126.8  Mean   :4.05
>> 3rd Qu.:30.40  3rd Qu.:6.000  3rd Qu.:160.0  3rd Qu.:113.0  3rd Qu.:4.22
>> Max.   :33.90  Max.   :8.000  Max.   :351.0  Max.   :335.0  Max.   :4.93
>>      wt            qsec             vs           am          gear
>> Min.   :1.513  Min.   :14.50  Min.   :0.0000  Min.   :1  Min.   :4.000
>> 1st Qu.:1.935  1st Qu.:16.46  1st Qu.:0.0000  1st Qu.:1  1st Qu.:4.000
>> Median :2.320  Median :17.02  Median :1.0000  Median :1  Median :4.000
>> Mean   :2.411  Mean   :17.36  Mean   :0.5385  Mean   :1  Mean   :4.385
>> 3rd Qu.:2.780  3rd Qu.:18.61  3rd Qu.:1.0000  3rd Qu.:1  3rd Qu.:5.000
>> Max.   :3.570  Max.   :19.90  Max.   :1.0000  Max.   :1  Max.   :5.000
>>     carb
>> Min.   :1.000
>> 1st Qu.:1.000
>> Median :2.000
>> Mean   :2.923
>> 3rd Qu.:4.000
>> Max.   :8.000
>> >
>>
>> I am using the latest version, R 3.2.4 on Windows; however, this error
>> also occurs in the previous version.
>>
>> I hope this report will receive serious attention for debugging.
>>
>> With best regards,
>>
>> Dr. A.K. Singh
>> Head, Department of Agril. Statistics
>> Indira Gandhi Krishi Vishwavidyalaya, Raipur
>> Chhattisgarh, India, PIN-492012
>> Mobile: +919752620740
>> Email: akhileshsingh.igkv at gmail.com
>
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr.90
> 050663 Bucharest sector 5
> Romania
>