[R] confusion on levels() function, and how to assign a wanted order to factor levels, intentionally?

baptiste auguie ba208 at exeter.ac.uk
Tue Jun 16 11:58:47 CEST 2009


Commenting on this, is there a strong argument against modifying 
relevel() to reorder more than one level at a time?

I started a topic a while back ("recursive relevel", 
https://stat.ethz.ch/pipermail/r-help/2009-January/184397.html) and I've 
happily used the proposed change since then by overloading 
stats:::relevel.factor.

I'm sure there should be additional checking (for which I'm not versed 
enough in programming to propose a real patch) but the enhanced 
functionality seems very desirable (especially for plotting).
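
For illustration, here is a minimal sketch of what a multi-level relevel could look like. The name relevel_multi is hypothetical (it is not part of base R or the stats package), and the checks are only the ones suggested in the quoted reply below, plus a guard against an empty 'ref'.

```r
# Hypothetical helper, not part of base R: move several levels to the front
# of a factor while keeping the remaining levels in their original order.
relevel_multi <- function(x, ref) {
  stopifnot(is.factor(x))
  lev <- levels(x)
  if (length(ref) < 1)
    stop("'ref' must name at least one level")
  if (is.character(ref))
    ref <- match(ref, lev)          # translate level names to positions
  if (any(is.na(ref)))
    stop("'ref' must contain existing levels")
  if (any(ref < 1 | ref > length(lev)))
    stop("'ref' must index into the levels of 'x'")
  if (any(duplicated(ref)))
    stop("'ref' must not contain duplicates")
  # Requested levels first, then whatever is left, in the old order
  factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
}

f <- factor(c("a", "b", "c", "d", "b", "a"))
g <- relevel_multi(f, c("c", "a"))
levels(g)   # "c" "a" "b" "d"
```

Only the level order changes; the values stored in the factor stay the same, which is what matters for plotting.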

Best regards,

baptiste




Peter Dalgaard wrote:
> Mark Difford wrote:
>   
>> Hi Mao,
>>
>>     
>>>> I am confused. And, I want to know how to assign a wanted order to factor 
>>>> levels, intentionally?
>>>>         
>> You want ?relevel. Although the documentation leads one to think that it can
>> only be used to set a reference level, with the other levels being moved
>> down, presently it can in fact be used to set any order you wish. For a
>> factor with just a few levels you could simply use an index into the default
>> order.
>>
>> ##
>> new_d <- d
>> c(5,1,6:10,2:4)
>> new_d$population <- relevel(d$population,
>> levels(d$population)[c(5,1,6:10,2:4)])
>>
>> Ignore the warning. Note that relevel can also be used "on-the-fly," so
>> without permanently changing level-order.
>>     
>
> Now that's a dangerous strategy! You're relying on undocumented
> behaviour and ignoring a warning message to boot. If someone implements
> a check that ref is a scalar as assumed, you're shot.
>
> Better to have a look at why stats:::relevel.factor currently works and
> use the same mechanism:
>
>     lev <- levels(x)
>     if (is.character(ref))
>         ref <- match(ref, lev)
>     if (is.na(ref))
>         stop("'ref' must be an existing level")
>     nlev <- length(lev)
>     if (ref < 1 || ref > nlev)
>         stop(gettextf("ref = %d must be in 1:%d", ref, nlev),
>             domain = NA)
>     factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
>
> and if you assume an integer reordering in ref, this reduces to
>
>     lev <- levels(x)
>     factor(x, levels = lev[ref])
>
> and if ref is a character vector, plain
>
>     factor(x, levels=ref)
>
> should do.
>
> (Or, you can go "full monty" and retain all the checks and balances, just
> cure the warning using
>
> if (any(is.na(ref)))
>         stop("'ref' must contain existing levels")
> ...
> if (any(ref < 1 | ref > nlev))
>
> Maybe also check !any(duplicated(ref)) for good measure
> )
>
>
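
As a small worked example of the documented route, the sketch below reorders all levels of a toy factor with factor(x, levels = ...), once by name and once by integer index. The data are made up for illustration. One caveat worth keeping in mind: any value whose level is left out of 'levels' silently becomes NA.

```r
# Toy data, invented for this example
x <- factor(c("low", "high", "mid", "high", "low"))
levels(x)   # default alphabetical order: "high" "low" "mid"

# Complete reordering by name
by_name <- factor(x, levels = c("low", "mid", "high"))

# The same reordering by integer index into the current levels
by_index <- factor(x, levels = levels(x)[c(2, 3, 1)])

identical(levels(by_name), levels(by_index))   # TRUE

# Omitting a level turns the matching values into NA without a warning
dropped <- factor(x, levels = c("low", "mid"))
sum(is.na(dropped))   # 2
```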



