[Rd] quantile(), IQR() and median() for factors
Simone Giannerini
sgiannerini at gmail.com
Fri Mar 6 00:48:55 CET 2009
Dear all,
from the help page of quantile:
"x numeric vectors whose sample quantiles are wanted. Missing
values are ignored."
from the help page of IQR:
"x a numeric vector."
As a matter of fact, it seems that neither quantile() nor IQR()
checks that the input is numeric.
See the following:
> set.seed(11)
> x <- rbinom(n=11,size=2,prob=.5)
> x <- factor(x,ordered=TRUE)
> x
 [1] 1 0 1 0 0 2 0 1 2 0 0
Levels: 0 < 1 < 2
> quantile(x)
  0%  25%  50%  75% 100%
   0 <NA>    0 <NA>    2
Levels: 0 < 1 < 2
Warning messages:
1: In Ops.ordered((1 - h), qs[i]) :
  '*' is not meaningful for ordered factors
2: In Ops.ordered(h, x[hi[i]]) :
  '*' is not meaningful for ordered factors
> IQR(x)
[1] 1
whereas median() does have the check:
> median(x)
Error in median.default(x) : need numeric data
I would also like to take the opportunity to ask for your comments on
the following related subject.
In my opinion it would be convenient if median() and the like
(quantile(), IQR()) were implemented for ordered factors, for which
they can in fact be well defined. In this way, functions like
apply(x,FUN=median,...) could be used without further processing on
data frames that contain both numeric variables and ordered factors.
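To make the idea concrete, here is a minimal sketch of a quantile for
ordered factors that uses only sorting and indexing, so no arithmetic on
the levels is needed. It assumes the inverse empirical CDF definition
(each quantile is an observed value); quantile_ordered() is a
hypothetical name, not an existing function, and the data are the values
printed above, typed in by hand.

```r
# Hypothetical helper, not part of base R: sample quantiles of an ordered
# factor via the inverse empirical CDF, so every result is an observed level.
quantile_ordered <- function(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                             na.rm = FALSE) {
  stopifnot(is.ordered(x))
  if (na.rm) {
    x <- x[!is.na(x)]
  } else if (anyNA(x)) {
    stop("missing values present; use na.rm = TRUE")
  }
  n <- length(x)
  if (n == 0L) stop("need at least one non-missing value")
  xs <- sort(x)                        # sorts by the order of the levels
  idx <- pmax(1, ceiling(n * probs))   # position of each quantile in xs
  xs[idx]
}

# The 11 values shown in the session above, entered directly.
x <- factor(c(1, 0, 1, 0, 0, 2, 0, 1, 2, 0, 0), levels = 0:2, ordered = TRUE)
q <- quantile_ordered(x)
as.character(q)
```

With this definition the 25% and 75% quantiles are levels of the factor
rather than NA. The sketch does not guard against floating-point fuzz in
n * probs, which a real implementation would probably need.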
To my limited knowledge, English-language introductory statistics
textbooks mention only marginally that the median is well defined for
ordered categorical variables. In the Italian statistics literature,
on the other hand, this is often discussed in detail, so students and
practitioners might be misled into expecting median() to work for
ordered factors.
In this message
https://stat.ethz.ch/pipermail/r-help/2003-November/042684.html
Martin Maechler considers the possibility of doing this by
allowing the extra arguments "low" and "high", as is done for mad().
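As a rough sketch of how "low" and "high" could work for ordered
factors (my own reading of the idea, not code from that message):
median_ordered() is a hypothetical helper, and the factor y is made up
for illustration. When the two middle values differ, no average exists,
so the caller has to pick one of them.

```r
# Hypothetical helper, not part of base R: median of an ordered factor.
# For an even number of observations the two middle values may differ;
# 'low' or 'high' selects one of them instead of averaging.
median_ordered <- function(x, na.rm = FALSE, low = FALSE, high = FALSE) {
  stopifnot(is.ordered(x), !(low && high))
  if (na.rm) {
    x <- x[!is.na(x)]
  } else if (anyNA(x)) {
    stop("missing values present; use na.rm = TRUE")
  }
  n <- length(x)
  if (n == 0L) stop("need at least one non-missing value")
  xs <- sort(x)                 # sorts by the order of the levels
  lo <- xs[(n + 1L) %/% 2L]     # lower middle value
  hi <- xs[n %/% 2L + 1L]       # upper middle value
  if (lo == hi) return(lo)      # odd n, or equal middle values
  if (low) return(lo)
  if (high) return(hi)
  stop("median is not unique; set low = TRUE or high = TRUE")
}

# The 11 values from the session above: a unique middle value.
x <- factor(c(1, 0, 1, 0, 0, 2, 0, 1, 2, 0, 0), levels = 0:2, ordered = TRUE)
m_odd <- median_ordered(x)

# A made-up factor with an even number of values and different middle values.
y <- factor(c("a", "a", "b", "c"), levels = c("a", "b", "c"), ordered = TRUE)
m_low <- median_ordered(y, low = TRUE)
m_high <- median_ordered(y, high = TRUE)
```

Raising an error when neither argument is given is only one possible
choice; returning NA with a warning would be another.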
I am willing to contribute if requested, and comments are welcome.
Thank you for your attention,
kind regards,
Simone
> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          8.1
year           2008
month          12
day            22
svn rev        47281
language       R
version.string R version 2.8.1 (2008-12-22)
LC_COLLATE=Italian_Italy.1252;LC_CTYPE=Italian_Italy.1252;LC_MONETARY=Italian_Italy.1252;LC_NUMERIC=C;LC_TIME=Italian_Italy.1252
--
______________________________________________________
Simone Giannerini
Dipartimento di Scienze Statistiche "Paolo Fortunati"
Universita' di Bologna
Via delle belle arti 41 - 40126 Bologna, ITALY
Tel: +39 051 2098262 Fax: +39 051 232153
http://www2.stat.unibo.it/giannerini/