[R] Condition to factor (easy to remember)
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Sep 30 21:54:40 CEST 2009
Douglas Bates wrote:
> On Wed, Sep 30, 2009 at 2:42 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Wed, Sep 30, 2009 at 2:43 AM, Dieter Menne
>> <dieter.menne at menne-biomed.de> wrote:
>>
>>> Dear List,
>>> creating factors with a given non-default order is notoriously difficult to
>>> explain in a course. Students love the ifelse construct given below most,
>>> but I remember some comment from Martin Mächler (?) that ifelse should be
>>> banned from courses.
>>> Any better idea? It need not be short; being easy to remember is what matters.
>>> Dieter
>>> data = c(1,7,10,50,70)
>>> levs = c("Pre","Post")
>>>
>>> # Typical C-Programmer style
>>> factor(levs[as.integer(data > 10) + 1], levels = levs)
>>>
>>> # Easiest to understand
>>> factor(ifelse(data <=10, levs[1], levs[2]), levels=levs)
>> Why not
>>
>> > factor(data > 10, labels = c("Pre", "Post"))
>> [1] Pre Pre Pre Post Post
>> Levels: Pre Post
>>
>> All you have to remember is that FALSE comes before TRUE.
>
> And besides, Frank Harrell will soon be weighing in to tell you why
> you shouldn't dichotomize in the first place.
And someone might also remind you that it is safest to include
levels=c(FALSE,TRUE), just in case the condition is always TRUE. (Terry
Therneau has the scars from the implementation of Surv()...)
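
A minimal sketch of the difference, reusing Dieter's data and levs. The
vector all_post is made up here purely to illustrate the case where the
condition is TRUE for every element:

```r
data <- c(1, 7, 10, 50, 70)
levs <- c("Pre", "Post")

# Explicit levels: FALSE maps to "Pre", TRUE maps to "Post"
f <- factor(data > 10, levels = c(FALSE, TRUE), labels = levs)

# Hypothetical data in which every value exceeds 10
all_post <- c(50, 70)

# Without levels=, only the level TRUE is found, so two labels do not fit
# and factor() stops with an error
failed <- inherits(try(factor(all_post > 10, labels = levs), silent = TRUE),
                   "try-error")

# With levels=, both levels are kept even though "Pre" never occurs
safe <- factor(all_post > 10, levels = c(FALSE, TRUE), labels = levs)
table(safe)
```

The point of spelling out the levels is that the result has the same two
levels whatever the data happen to contain, so later tables and models do
not silently lose a category.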
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907