[Rd] Proposal unary - operator for factors
William Dunlap
wdunlap at tibco.com
Thu Feb 4 01:43:26 CET 2010
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch at stats.uwo.ca]
> Sent: Wednesday, February 03, 2010 4:17 PM
> To: William Dunlap
> Cc: Hadley Wickham; r-devel at r-project.org
> Subject: Re: [Rd] Proposal unary - operator for factors
>
> On 03/02/2010 6:49 PM, William Dunlap wrote:
> >> -----Original Message-----
> >> From: h.wickham at gmail.com [mailto:h.wickham at gmail.com] On
> >> Behalf Of Hadley Wickham
> >> Sent: Wednesday, February 03, 2010 3:38 PM
> >> To: William Dunlap
> >> Cc: r-devel at r-project.org
> >> Subject: Re: [Rd] Proposal unary - operator for factors
> >>
> >>> It wouldn't make sense in the context of
> >>> vector[-factor]
> >> True, but that doesn't work currently so you wouldn't lose anything.
> >> However, it would make a certain class of problem that used to throw
> >> errors become silent.
> >>
> >>> Wouldn't it be better to allow order's decreasing argument
> >>> to be a vector with one element per ... argument? That
> >>> would work for numbers, factors, dates, and anything
> >>> else. Currently order silently ignores decreasing[2] and
> >>> beyond.
> >> The problem is you might want to do something like
> >> order(a, -b, c, -d)
> >
> > Currently, for numeric a you can do either
> > order(-a)
> > or
> > order(a, decreasing=TRUE)
> > For nonnumeric types like POSIXct and factors only
> > the latter works.
> >
> > Under my proposal your
> > order(a, -b, c, -d)
> > would be
> > order(a, b, c, d, decreasing=c(FALSE,TRUE,FALSE,TRUE))
> > and it would work for any orderable class without modifications
> > to any classes.
>
> Why not use
>
> order(a, -xtfrm(b), c, -xtfrm(d))
>
> ??
You could, if you can remember it. I have been annoyed
that decreasing= was in order() but not as useful as it
could be since it is not vectorized. The same goes for
na.last, although that seems less useful to me.
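As a small illustration of the xtfrm() idiom suggested above, here is a sketch with made-up vectors (a and b are hypothetical, not taken from this thread). It negates the integer codes that xtfrm() returns for a factor, so the factor sorts in decreasing order within each value of a:

```r
# Hypothetical example data: a numeric key and a factor key
a <- c(1, 1, 2, 2)
b <- factor(c("lo", "hi", "lo", "hi"), levels = c("lo", "hi"))

# -b is not defined for factors, but xtfrm(b) is numeric and can be negated
idx <- order(a, -xtfrm(b))
idx
# [1] 2 1 4 3
```

Within a == 1 the "hi" row comes before the "lo" row, and likewise within a == 2.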
Here is a version of order (based on the
algorithm used in S+'s order) that
vectorizes the na.last and decreasing
arguments. It calls the existing order
function to implement decreasing=TRUE/FALSE
and na.last=TRUE/FALSE for a single argument,
but order itself could be modified in this
way.
new.order <- function (..., na.last = TRUE, decreasing = FALSE)
{
    vectors <- list(...)
    nVectors <- length(vectors)
    stopifnot(nVectors > 0)
    # recycle na.last and decreasing to one value per sort key
    na.last <- rep(na.last, length.out = nVectors)
    decreasing <- rep(decreasing, length.out = nVectors)
    keys <- seq_len(length(vectors[[1]]))
    # sort by the last key first; order() leaves ties in their
    # existing order, so earlier keys end up taking precedence
    for (i in nVectors:1) {
        v <- vectors[[i]]
        if (length(v) < length(keys))
            v <- rep(v, length.out = length(keys))
        keys <- keys[order(v[keys], na.last = na.last[i],
                           decreasing = decreasing[i])]
    }
    keys
}
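The loop in new.order sorts by the last key first and relies on order() leaving tied elements in their existing order. A minimal sketch of that idea with two keys, using made-up vectors (a and b are hypothetical), might look like this:

```r
# Hypothetical example data: a is the primary key, b the secondary key
a <- c(2, 1, 2, 1, 1)
b <- c(3, 5, 1, 4, 4)

# Step 1: order by the secondary key
keys <- order(b)
# Step 2: reorder by the primary key; ties in a keep the b-ordering
# from step 1 because order() does not disturb tied elements
keys <- keys[order(a[keys])]

keys
# [1] 4 5 2 3 1
identical(keys, order(a, b))
# [1] TRUE
```

Because each pass is a separate call to order(), each key can be given its own decreasing and na.last setting, which is what new.order does.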
With the following dataset
data <- data.frame(
    ct = as.POSIXct(c("2009-01-01", "2010-02-03",
                      "2010-02-28"))[c(2,2,2,3,3,1)],
    dt = as.Date(c("2009-01-01", "2010-02-03",
                   "2010-02-28"))[c(3,2,2,2,3,1)],
    fac = factor(c("Small","Medium","Large"),
                 levels=c("Small","Medium","Large"))[c(1,3,2,3,3,1)],
    n = c(11,12,12,11,12,12))
> data
          ct         dt    fac  n
1 2010-02-03 2010-02-28  Small 11
2 2010-02-03 2010-02-03  Large 12
3 2010-02-03 2010-02-03 Medium 12
4 2010-02-28 2010-02-03  Large 11
5 2010-02-28 2010-02-28  Large 12
6 2009-01-01 2009-01-01  Small 12
> data.frame(lapply(data,rank))
   ct  dt fac   n
1 3.0 5.5 1.5 1.5
2 3.0 3.0 5.0 4.5
3 3.0 3.0 3.0 4.5
4 5.5 3.0 5.0 1.5
5 5.5 5.5 5.0 4.5
6 1.0 1.0 1.5 4.5
we get (where my demos use rank because I could not remember
the name xtfrm):
> with(data, identical(order(ct,dt), new.order(ct,dt)))
[1] TRUE
> with(data, identical(order(fac,-n),
      new.order(fac,n,decreasing=c(FALSE,TRUE))))
[1] TRUE
> with(data, identical(order(ct,-rank(dt)),
      new.order(ct,dt,decreasing=c(FALSE,TRUE))))
[1] TRUE
> with(data, identical(order(ct,-rank(fac)),
      new.order(ct,fac,decreasing=c(FALSE,TRUE))))
[1] TRUE
> with(data, identical(order(n,-rank(fac)),
      new.order(n,fac,decreasing=c(FALSE,TRUE))))
[1] TRUE
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
> Duncan Murdoch
>