Summary vs fivenum results for Q3
Ken Knoblauch
knoblauch at lyon.inserm.fr
Tue Oct 9 17:32:50 CEST 2007
Schaefer, Robert L. Dr. <schaefrl <at> muohio.edu> writes:
> I've just started using R and am still a neophyte, but I found the
> following curious result. I'm using the current version of R
> (2.5.1 (2007-06-27)).
>
> Why are the results for the third quartile different in the output
> from the summary and fivenum commands?
> For the following data set
>
> 457 514 530 530 538 560 687 745 745 778 786 790 792 821 821 822 822
> 828 845 850 886 886 886 913 1050 1050 1065 1065 1065 1065 1090 1130
>
> Summary yields:
>
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   457.0   745.0   822.0   825.4   947.2  1130.0
>
> While fivenum yields:
>
> [1] 457.0 745.0 822.0 981.5 1130.0
>
> The third quartile is being correctly calculated in the fivenum
> command and incorrectly in the summary command.
>
> Bob
If you look in ?boxplot.stats, it says:
The two “hinges” are versions of the first and third quartile,
i.e., close to quantile(x, c(1,3)/4). The hinges equal the quartiles
for odd n (where n <- length(x)) and differ for even n. Where the
quartiles only equal observations for n %% 4 == 1 (n = 1 mod 4),
the hinges do so additionally for n %% 4 == 2 (n = 2 mod 4), and are
in the middle of two observations otherwise.
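
The odd/even behaviour described in that help text can be checked on toy data. The following is only a small sketch: the vectors 1:n and the data frame name cmp are made up for illustration, and quantile() is called with its default settings (type 7).

```r
# Compare the hinges from fivenum() with the default (type 7) quartiles
# from quantile() for the toy vectors 1:n, n = 5, ..., 12.
cmp <- do.call(rbind, lapply(5:12, function(n) {
  x <- 1:n
  h <- fivenum(x)                          # min, lower hinge, median, upper hinge, max
  q <- unname(quantile(x, c(1, 3) / 4))    # first and third quartile
  data.frame(n = n, n_mod_4 = n %% 4,
             lower_hinge = h[2], q1 = q[1],
             upper_hinge = h[4], q3 = q[2])
}))
print(cmp)
```

For odd n the hinge and quartile columns should agree, and for even n they should differ, which is the situation in the original question.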
I got here by looking at summary.default and seeing that it
uses the quantile function, and then looking at fivenum to see
that it did not. Looking at the help for fivenum led me to
boxplot.stats, where I saw that it was not necessarily doing
the same thing.
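
For the data in the question, both numbers can be reproduced by hand. This is a sketch assuming quantile()'s default type = 7 (linear interpolation at position (n - 1) * p + 1), which is what summary() ends up using; the variable names are only illustrative.

```r
x <- c(457, 514, 530, 530, 538, 560, 687, 745, 745, 778, 786, 790, 792,
       821, 821, 822, 822, 828, 845, 850, 886, 886, 886, 913, 1050, 1050,
       1065, 1065, 1065, 1065, 1090, 1130)
xs <- sort(x)
n  <- length(xs)                     # 32, so n %% 4 == 0

# Third quartile as quantile() computes it by default (type 7):
# interpolate between the 24th and 25th ordered values.
pos <- (n - 1) * 0.75 + 1            # 24.25
q3_by_hand <- xs[24] + (pos - 24) * (xs[25] - xs[24])

# Upper hinge as fivenum() computes it: for this n it is the midpoint
# of the 24th and 25th ordered values.
hinge_by_hand <- (xs[24] + xs[25]) / 2

print(c(quantile = unname(quantile(x, 0.75)), by_hand = q3_by_hand))
print(c(fivenum = fivenum(x)[4], by_hand = hinge_by_hand))
```

Both functions look at the same pair of observations (913 and 1050) and simply place the cut point differently between them, so neither result is a bug. The type argument of quantile() selects among other quartile definitions if a different convention is wanted.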
HTH
--
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
Département Neurosciences Intégratives
18 avenue du Doyen Lépine
69500 Bron
France
