[R] confusion on levels() function, and how to assign a wanted order to factor levels, intentionally?
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Tue Jun 16 11:42:48 CEST 2009
Mark Difford wrote:
> Hi Mao,
>
>>> I am confused. And, I want to know how to assign a wanted order to factor
>>> levels, intentionally?
>
> You want ?relevel. Although the documentation leads one to think that it can
> only be used to set a reference level, with the other levels being moved
> down, presently it can in fact be used to set any order you wish. For a
> factor with just a few levels you could simply use an index into the default
> order.
>
> ##
> new_d <- d
> c(5,1,6:10,2:4)
> new_d$population <- relevel(d$population,
> levels(d$population)[c(5,1,6:10,2:4)])
>
> Ignore the warning. Note that relevel can also be used "on-the-fly," so
> without permanently changing level-order.
Now that's a dangerous strategy! You're relying on undocumented
behaviour and ignoring a warning message to boot. If someone implements
a check that ref is a scalar as assumed, you're shot.
Better to have a look at why stats:::relevel.factor currently works and
use the same mechanism:
    lev <- levels(x)
    if (is.character(ref))
        ref <- match(ref, lev)
    if (is.na(ref))
        stop("'ref' must be an existing level")
    nlev <- length(lev)
    if (ref < 1 || ref > nlev)
        stop(gettextf("ref = %d must be in 1:%d", ref, nlev),
             domain = NA)
    factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
and if you assume an integer reordering in ref, this reduces to
    lev <- levels(x)
    factor(x, levels = lev[ref])
and if ref is a character vector, plain
    factor(x, levels = ref)
should do.
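
To illustrate the two reduced forms, here is a minimal sketch on a toy factor. The data, the index vector and the variable names x_int and x_chr are made up for the example; only factor() and levels() from base R are used.

```r
# Toy factor; the values and the wanted order are assumptions for illustration.
x <- factor(c("b", "a", "c", "a", "b"))
lev <- levels(x)                 # default order is alphabetical: "a" "b" "c"

# Integer reordering: 3rd level first, then the 1st, then the 2nd.
ref <- c(3, 1, 2)
x_int <- factor(x, levels = lev[ref])

# Character reordering: name the levels directly in the wanted order.
x_chr <- factor(x, levels = c("c", "a", "b"))

levels(x_int)                    # new level order
identical(x_int, x_chr)          # both routes give the same factor
as.character(x_int)              # the data values themselves are unchanged
```

Note that any level left out of the levels argument would turn the corresponding observations into NA, so the new order should list every existing level.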
(Or, you can go "full monty" and retain all the checks and balances, just
cure the warning using
    if (any(is.na(ref)))
        stop("'ref' must contain existing levels")
    ...
    if (any(ref < 1 | ref > nlev))
Maybe also check !any(duplicated(ref)) for good measure.
)
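
Put together, the vectorised checks might look like the sketch below. The function name reorder_levels is hypothetical (it is not part of base R or stats), the error wording for the range and duplicate checks is my own, and setdiff() is used instead of negative indexing so that an empty ref cannot silently drop every level.

```r
# Hypothetical helper, not part of base R: reorder the levels of a factor,
# keeping relevel.factor-style checks but allowing a vector 'ref'.
reorder_levels <- function(x, ref) {
    stopifnot(is.factor(x))
    lev <- levels(x)
    if (is.character(ref))
        ref <- match(ref, lev)          # level names -> integer positions
    if (any(is.na(ref)))
        stop("'ref' must contain existing levels")
    nlev <- length(lev)
    if (any(ref < 1 | ref > nlev))
        stop(gettextf("all of 'ref' must be in 1:%d", nlev), domain = NA)
    if (any(duplicated(ref)))
        stop("'ref' must not contain duplicates")
    # Levels named in 'ref' come first; the rest keep their previous order.
    factor(x, levels = lev[c(ref, setdiff(seq_along(lev), ref))])
}

x <- factor(c("lo", "hi", "mid", "hi"))           # default levels: hi, lo, mid
y <- reorder_levels(x, c("lo", "mid", "hi"))      # full reordering by name
z <- reorder_levels(x, 3)                         # scalar ref, as with relevel()
levels(y)
levels(z)
```

With a scalar ref this behaves like relevel(), moving one level to the front; with a full vector it sets the complete order and raises no warning.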
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907