[R] randomForest and ordered factors
Birgit Lemcke
birgit.lemcke at systbot.uzh.ch
Tue Apr 29 17:00:46 CEST 2008
Hello Andy,
thanks for your answer, and sorry that I did not check rfNews().
You are right: at present I am only looking at variable importance, so
I am very happy to hear that I won't have problems using ordered
factors.
Since I have only just started to fiddle around with randomForest, I
may have more questions later. Let's see.
For now I am very grateful that you provide this package.
Regards
Birgit
On 29.04.2008 at 16:29, Liaw, Andy wrote:
> If you are using the latest version (4.5-25), you will see in rfNews()
> that this is a problem I still need to fix. The package used to
> handle ordered factors, but some more stringent checks for
> factor-level consistency introduced in 4.5-23 broke the support for
> ordered factors in prediction.
>
> From the code you've shown, it looks like you are just growing the
> forest to evaluate variable importance or other things, instead of
> predicting other data (since you set keep.forest=FALSE). If
> that's the case, you should be fine, as the problem only happens
> when you try to call predict() with models that contain ordered
> factors as predictors.
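
One possible way to sidestep the predict() issue is to replace ordered factors with their integer codes before fitting. This is only a sketch and an assumption, not a fix confirmed in this thread; it relies on the statement in this thread that ordered factors are treated like numerics. The helper `ordered_to_integer` and the `toy` data frame are hypothetical and use base R only.

```r
# Hypothetical helper: replace every ordered factor column by its integer codes.
# Unordered factors and all other columns are left untouched.
ordered_to_integer <- function(df) {
  is_ord <- vapply(df, is.ordered, logical(1))
  df[is_ord] <- lapply(df[is_ord], as.integer)
  df
}

# Made-up example data; the column names are illustrative only.
toy <- data.frame(
  Sex    = factor(c("f", "m", "f", "m")),
  Size   = factor(c("small", "large", "medium", "small"),
                  levels = c("small", "medium", "large"), ordered = TRUE),
  Length = c(1.2, 3.4, 2.2, 1.1)
)

toy_num <- ordered_to_integer(toy)
str(toy_num)
```

The same conversion would have to be applied to any new data passed to predict(), with identical level order, so that the integer codes mean the same thing in both data sets.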
>
> (Ordered factors are essentially treated as numerics in RF: trees
> use only the ranks of numeric variables, so there is no practical
> difference between ordered factors and numeric variables as
> predictors.)
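
The rank argument can be illustrated with a small base-R sketch. It does not call randomForest; it only shows, on made-up data, that an ordered factor's integer codes and any strictly increasing transformation of them allow exactly the same threshold splits. The function `threshold_partitions` and the variable names are hypothetical.

```r
# Made-up ordered factor with levels small < medium < large.
size <- factor(c("small", "large", "medium", "small", "large", "medium"),
               levels = c("small", "medium", "large"), ordered = TRUE)

# Integer codes follow the level order: small = 1, medium = 2, large = 3.
codes <- as.integer(size)

# A strictly increasing transformation changes the values but not the ranks.
stretched <- codes^3 + 10

# Enumerate the partitions reachable by a threshold split "x <= cut",
# using each distinct value except the largest as a cut point.
threshold_partitions <- function(x) {
  cuts <- head(sort(unique(x)), -1)
  lapply(cuts, function(cut) x <= cut)
}

same_ranks  <- identical(rank(codes), rank(stretched))
same_splits <- identical(threshold_partitions(codes),
                         threshold_partitions(stretched))
```

Both `same_ranks` and `same_splits` are TRUE here: a tree that splits on thresholds sees the same candidate partitions either way, which is why only the order of the levels matters.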
>
> Andy
>
> From: Birgit Lemcke
>>
>> Hello R-user!
>>
>> I am running R 2.7.0 on a PowerBook (Tiger). (I am still a beginner
>> in both R and statistics.)
>>
>> I am trying to find the variables that are most important for
>> separating my dataset into the groups given by a categorical variable.
>>
>> code:
>>
>> Test.rf4 <- randomForest(Sex ~ ., na.action = na.roughfix,
>>     data = Subset4, importance = TRUE, proximity = TRUE,
>>     ntree = 10000, do.trace = 1000, keep.forest = FALSE)
>>
>> My dataset also contains ordered factors, coded as such.
>> Can randomForest deal with them? Does it treat them differently, or
>> is there no difference between using factors and ordered factors?
>>
>> Many thanks in advance
>>
>> B.
>>
>> Birgit Lemcke
>> Institut für Systematische Botanik
>> Zollikerstrasse 107
>> CH-8008 Zürich
>> Switzerland
>> Ph: +41 (0)44 634 8351
>> birgit.lemcke at systbot.uzh.ch
>>
>> 175 years of UZH
>> "Marvel. Experience. Understand. Hands-on science."
>> MNF anniversary event for young and old.
>> 19 April 2008, 10:00 to 02:00
>> Campus Irchel, Winterthurerstrasse 190, 8057 Zürich
>> More information: http://www.175jahre.uzh.ch/naturwissenschaft
>>