[R] recursive relevel
baptiste auguie
ba208 at exeter.ac.uk
Fri Jan 9 16:49:51 CET 2009
Thanks Thierry,
A quick test shows almost equivalent timing with the modification of
relevel() suggested earlier:
> relevel <-
> function (x, ref, ...)
> {
>     lev <- levels(x)
>     if (is.character(ref))
>         ref <- match(ref, lev)
>     if (any(is.na(ref)))
>         stop("'ref' must be an existing level")
>     nlev <- length(lev)
>     if (any(ref < 1 | ref > nlev))
>         stop(gettextf("ref = %d must be in 1:%d", ref, nlev),
>              domain = NA)
>     factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
> }
> > system.time(relevel(y, c("D", "B")))
>    user  system elapsed
>   5.972   0.258   6.395
> >
> > system.time(order.factor3(y, c("D", "B")))
>    user  system elapsed
>   5.962   0.274   6.459
It's always good to learn other options, though.
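Timing aside, it may be worth checking that the vectorised variant gives the same level order as the recursive one. A minimal sketch, assuming the modified function is stored under the hypothetical name relevel_multi (so that stats::relevel is not masked), with the range check left out for brevity and a four-element factor instead of the large y:

```r
# Hypothetical name: the modified relevel() from above, renamed so that
# stats::relevel() stays available for the recursive version.
relevel_multi <- function(x, ref, ...) {
  lev <- levels(x)
  if (is.character(ref))
    ref <- match(ref, lev)
  if (any(is.na(ref)))
    stop("'ref' must be an existing level")
  # requested levels first, all remaining levels in their existing order
  factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
}

# Recursive version from the original post, condensed
order_factor <- function(x, ref) {
  n <- length(ref)
  if (n == 1) return(relevel(x, ref))
  order_factor(relevel(x, ref[n]), ref[-n])
}

ff <- factor(c("a", "b", "c", "d"))
levels(relevel_multi(ff, c("c", "d")))  # "c" "d" "a" "b"
levels(order_factor(ff, c("c", "d")))   # "c" "d" "a" "b"
```

Both calls put c and d first, in that order, and leave the data values untouched.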
Thanks,
baptiste
On 9 Jan 2009, at 15:50, ONKELINX, Thierry wrote:
> Dear Baptiste,
>
> You can avoid the recursive stuff. And it will run about twice as
> fast.
>
>> order.factor <- function (x, ref)
> + {
> + last.index <- length(ref) # convenience for matlab's end keyword
> + if(last.index == 1) return(relevel(x, ref)) # end case, normal case
> + my.new.list <- list(x=relevel(x, ref[last.index]), ref=ref[-last.index])
> + return(do.call(order.factor, my.new.list)) # recursive call
> + }
>>
>> order.factor2 <- function(x, ref){
> + factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
> + }
>> order.factor3 <- function(x, ref){
> + factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])),
> +        labels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
> + }
>>
>> x <- factor(sample(LETTERS[1:5], 10000000, replace = TRUE))
>> y <- factor(sample(LETTERS[1:20], 10000000, replace = TRUE))
>> system.time(order.factor(x, c("D", "B")))
> user system elapsed
> 5.69 0.38 6.09
>> system.time(order.factor2(x, c("D", "B")))
> user system elapsed
> 3.90 0.20 4.12
>> system.time(order.factor3(x, c("D", "B")))
> user system elapsed
> 3.26 0.19 3.46
>> system.time(order.factor(y, c("D", "B")))
> user system elapsed
> 17.43 0.39 17.84
>> system.time(order.factor3(y, c("D", "B")))
> user system elapsed
> 8.25 0.17 8.46
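One difference the timings do not show: order.factor2() and order.factor3() sort the remaining levels, whereas relevel() keeps them in their existing order. For factors built with the default (sorted) levels, such as x and y above, the results agree; for a factor with a custom level order they need not. A small sketch, using a made-up factor ff:

```r
order.factor2 <- function(x, ref) {
  factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
}

# Made-up example: levels deliberately stored in reverse alphabetical order
ff <- factor(c("a", "b", "c", "d"), levels = c("d", "c", "b", "a"))

levels(relevel(ff, "b"))        # "b" "d" "c" "a": other levels keep their order
levels(order.factor2(ff, "b"))  # "b" "a" "c" "d": other levels are re-sorted
```

Which behaviour is preferable depends on whether the existing level order carries meaning.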
>
>
> HTH,
>
> Thierry
>
>
> ----------------------------------------------------------------------------
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for
> Nature and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
> methodology and quality assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no
> more than asking him to perform a post-mortem examination: he may be
> able to say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does
> not ensure that a reasonable answer can be extracted from a given
> body of data.
> ~ John Tukey
>
> -----Original message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On behalf of baptiste auguie
> Sent: Friday 9 January 2009 15:11
> To: R R-help
> Subject: [R] recursive relevel
>
> Dear list,
>
> I'm having second thoughts after solving a very trivial problem: I
> want to extend the relevel() function to reorder an arbitrary number
> of levels of a factor in one go. I could not find a trivial way of
> adapting the code obtained by getS3method("relevel","factor"). Instead,
> I thought of solving the problem recursively (possibly after reading
> Paul Graham's essays on Lisp too recently). Here is my attempt:
>
>>
>> order.factor <- function (x, ref)
>> {
>>   last.index <- length(ref) # convenience for matlab's end keyword
>>   if(last.index == 1) return(relevel(x, ref)) # end case, normal case of relevel
>>   # creating a list with updated parameters,
>>   # going through the list in reverse order
>>   my.new.list <- list(x=relevel(x, ref[last.index]),
>>                       ref=ref[-last.index]) # chop the vector from its last level
>>   return(do.call(order.factor, my.new.list)) # recursive call
>> }
>>
>> ff <- factor(c("a", "b", "c", "d"))
>> ff
>> relevel(ff, levels(ff)[1])
>> relevel(ff, levels(ff)[2]) # that's the usual case: you want to put a level first
>>
>> order.factor(x=ff, ref=c("a", "b"))
>> order.factor(x=ff, ref=c("c"))
>> order.factor(x=ff, ref=c("c", "d")) # that's my wish: put c and d in that order as the first two levels
>>
>
>
> I'm hoping this can be improved in several aspects:
>
> - there is probably already a better function I missed or overlooked
> (I'd still be curious about the following points, though)
>
> - after reading a few threads, it appears that some recursive
> functions are fragile in some sense, and I'm not sure what this means
> in practice. (Should I use Recall, somehow?)
>
> - it's probably quite slow for large data.frames
>
> - I could not think of a good name; this one might clash with some S3
> method, perhaps?
>
> - any other thoughts welcome!
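On the Recall question in the list above: one reading of the fragility concern is that the recursive call looks the function up by name, so it breaks if the function is later renamed or the name is reassigned. Recall() re-invokes the function that is currently executing instead. A minimal sketch, where order_factor_recall and f are names made up for illustration:

```r
order_factor_recall <- function(x, ref) {
  n <- length(ref)
  if (n == 1) return(relevel(x, ref))
  # Recall() calls the function currently being evaluated,
  # whatever name it happens to be bound to
  Recall(relevel(x, ref[n]), ref[-n])
}

# Rebind under another name and drop the original one
f <- order_factor_recall
rm(order_factor_recall)

ff <- factor(c("a", "b", "c", "d"))
levels(f(ff, c("c", "d")))  # "c" "d" "a" "b"
```

With a by-name recursive call, the last line would fail once the original name is removed; with Recall() it still works.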
>
>
> Best wishes,
>
> Baptiste
> _____________________________
>
> Baptiste Auguié
>
> School of Physics
> University of Exeter
> Stocker Road,
> Exeter, Devon,
> EX4 4QL, UK
>
> Phone: +44 1392 264187
>
> http://newton.ex.ac.uk/research/emag
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>