[R] Decision Tree and Random Forest

Achim Zeileis Achim.Zeileis at uibk.ac.at
Thu Apr 14 09:23:47 CEST 2016


On Thu, 14 Apr 2016, Michael Artz wrote:

> Ah yes, I will have to use the predict function.  But the predict function
> will not really get me there.  Take the example that I have a model
> predicting whether or not I will play golf (this is the dependent
> variable), with three independent variables: Humidity (High, Medium,
> Low), Pending_Chores (Taxes, None, Laundry, Car Maintenance) and Wind (High,
> Low).  I would like rules such that any record matching them gets a
> probability, e.g., IF humidity = High AND pending_chores = None AND
> Wind = High THEN there is a 77% probability that play_golf is YES.

Although I think that this toy example is not overly useful for practical
illustrations, we have included the standard dataset in the "partykit"
package:

## data
data("WeatherPlay", package = "partykit")

> I was thinking that random forest would weight the rules somehow over the
> collection of trees and give a probability.  But if that doesn't make
> sense, then can you just tell me how to get the decision rules with one
> tree and I will work from that.

Then you can learn one tree on this data, e.g., with rpart() or ctree():

## trees
library("rpart")
rp <- rpart(play ~ ., data = WeatherPlay,
   control = rpart.control(minsplit = 5))

library("partykit")
ct <- ctree(play ~ ., data = WeatherPlay,
   minsplit = 5, mincriterion = 0.1)

## visualize via partykit
pr <- as.party(rp)
plot(pr)
plot(ct)

The partykit package also includes a function to generate a text
representation of the rules, although it is currently not exported:

partykit:::.list.rules.party(pr)
##                                                         2
##                            "outlook %in% c(\"overcast\")"
##                                                         4
##  "outlook %in% c(\"sunny\", \"rainy\") & humidity < 82.5"
##                                                         5
## "outlook %in% c(\"sunny\", \"rainy\") & humidity >= 82.5"

partykit:::.list.rules.party(ct)
##                2                3
## "humidity <= 80"  "humidity > 80"
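To attach a probability to each rule, as asked for in the golf example
above, one option is to tabulate the observed response within each
terminal node. The following is only a sketch under a few assumptions: it
reuses the rpart tree from above, relies on the unexported helper (which
may change between partykit versions), and the names "share" and
"rule_tab" are made up for illustration.

```r
library("rpart")
library("partykit")

## same data and tree as above
data("WeatherPlay", package = "partykit")
rp <- rpart(play ~ ., data = WeatherPlay,
   control = rpart.control(minsplit = 5))
pr <- as.party(rp)

## rule text per terminal node (unexported helper, names are node ids)
rules <- partykit:::.list.rules.party(pr)

## terminal node id of each training observation
node <- predict(pr, type = "node")

## observed share of each play level within each terminal node
share <- prop.table(table(node = node, play = WeatherPlay$play), margin = 1)

## one row per rule: node id, rule text, share of "yes", node size
rule_tab <- data.frame(
  node = rownames(share),
  rule = unname(rules[rownames(share)]),
  p_yes = as.vector(share[, "yes"]),
  n = as.vector(table(node)),
  stringsAsFactors = FALSE
)
print(rule_tab)
```

With only 14 observations these shares are very rough estimates, so they
should be read as an illustration of the mechanics rather than as usable
probabilities.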

If you do not want a text representation but something else you can 
compute on, then look at the source code of partykit:::.list.rules.party() 
and try to adapt it to your needs.
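Short of adapting the source, the rule strings themselves happen to be
valid R expressions, so they can be evaluated against a data frame. This
is a sketch only, assuming the rpart tree from above and complete data
(no missing values); "hits" and "matched" are illustrative names, not
part of partykit.

```r
library("rpart")
library("partykit")

## same data and tree as above
data("WeatherPlay", package = "partykit")
rp <- rpart(play ~ ., data = WeatherPlay,
   control = rpart.control(minsplit = 5))
pr <- as.party(rp)
rules <- partykit:::.list.rules.party(pr)

## evaluate each rule as a logical expression within the data:
## one column per rule, one row per observation
hits <- sapply(rules, function(r) eval(parse(text = r), envir = WeatherPlay))

## node id of the rule that each observation satisfies
matched <- colnames(hits)[apply(hits, 1, which.max)]

## compare with the node ids reported by predict()
table(matched, predict(pr, type = "node"))
```

The same eval(parse()) step can be applied to a new data frame with the
same columns in order to flag which records follow which rule.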

> On Wed, Apr 13, 2016 at 4:30 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
>> I think you are missing the point of random forests. But if you just
>> want to predict using the forest, there is a predict() method that you
>> can use. Other than that, I certainly don't understand what you mean.
>> Maybe someone else might.
>>
>> Cheers,
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Apr 13, 2016 at 2:11 PM, Michael Artz <michaeleartz at gmail.com> wrote:
>>> OK, is there a way to do it with a decision tree?  I just need to make the
>>> decision rules. Perhaps I can pick one of the trees used with random
>>> forest.  I am already somewhat familiar with random forests with respect
>>> to bagging and feature sampling and getting the mode from the leaf nodes,
>>> and with it being an ensemble technique of many trees.  I am just working
>>> from the perspective that I need decision rules, and I am working backward
>>> from that, and I need to do it in R.
>>>
>>> On Wed, Apr 13, 2016 at 4:08 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>>
>>>> Nope.
>>>>
>>>> Random forests are not decision trees -- they are ensembles (forests)
>>>> of trees. You need to go back and read up on them so you understand
>>>> how they work. The Hastie/Tibshirani/Friedman "The Elements of
>>>> Statistical Learning" has a nice explanation, but I'm sure there are
>>>> lots of good web resources, too.
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>>
>>>> On Wed, Apr 13, 2016 at 1:40 PM, Michael Artz <michaeleartz at gmail.com> wrote:
>>>>> Hi, I'm trying to get the top decision rules from a decision tree.
>>>>> Eventually I would like to do this with R and random forest.  There has
>>>>> to be a way to output the decision rules of each leaf node in an easily
>>>>> readable way. I am looking at the randomForest and rpart packages and I
>>>>> don't see anything yet.
>>>>> Mike
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>


