[R] Parallelizing GBM
Mxkuhn
mxkuhn at gmail.com
Sun Mar 24 15:22:27 CET 2013
Yes, I think the second link is a test build of a parallelized cv loop within gbm().
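To illustrate what parallelizing the cv loop amounts to: the folds are independent of one another, so each can be fitted on its own core, whereas the trees inside a single boosted fit are built one after another, each depending on the fit so far. The sketch below is not the code from the test build. It is a hand-rolled version that assumes the gbm package is installed and uses mclapply from the parallel package. The toy data, the fold count k and the helper fit_fold are hypothetical stand-ins for trainRF and prices_train.

```r
library(parallel)
library(gbm)

# Hypothetical toy regression data (stand-ins for trainRF / prices_train)
set.seed(1)
n <- 400
x <- data.frame(a = rnorm(n), b = rnorm(n))
y <- x$a + 0.5 * x$b^2 + rnorm(n, sd = 0.1)

# Assign every row to one of k folds
k <- 4
folds <- sample(rep(seq_len(k), length.out = n))

# Fit on all folds except i, return the mean squared error on fold i
fit_fold <- function(i) {
  train <- folds != i
  fit <- gbm.fit(x[train, , drop = FALSE], y[train],
                 distribution = "gaussian",
                 n.trees = 50,
                 interaction.depth = 5,
                 n.minobsinnode = 10,
                 shrinkage = 0.1,
                 bag.fraction = 0.5,
                 verbose = FALSE)
  pred <- predict(fit, x[!train, , drop = FALSE], n.trees = 50)
  mean((y[!train] - pred)^2)
}

# mclapply forks worker processes; forking is unavailable on Windows,
# so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L
cv_mse <- unlist(mclapply(seq_len(k), fit_fold, mc.cores = cores))
```

Only the folds run concurrently here; each individual gbm.fit call is still sequential. On Windows, parLapply with a cluster from makeCluster would be the usual substitute for mclapply.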
On Mar 24, 2013, at 9:28 AM, "Lorenzo Isella" <lorenzo.isella at gmail.com> wrote:
> Thanks a lot for the quick answer.
> However, from what I see, the parallelization affects only the cross-validation part in the gbm interface (but it changes nothing when you call gbm.fit).
> Am I missing anything here?
> Is there any fundamental reason why gbm.fit cannot be parallelized?
>
> Lorenzo
>
>
>
> On Sun, 24 Mar 2013 12:45:39 +0100, Max Kuhn <mxkuhn at gmail.com> wrote:
>
>> See this:
>>
>> https://code.google.com/p/gradientboostedmodels/issues/detail?id=3
>>
>>
>> and this:
>>
>> https://code.google.com/p/gradientboostedmodels/source/browse/?name=parallel
>>
>>
>>
>> Max
>>
>>
>> On Sun, Mar 24, 2013 at 7:31 AM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
>>
>>> Dear All,
>>>
>>> I am far from being a guru about parallel programming.
>>>
>>> Most of the time, I rely on randomForest for data mining large datasets.
>>>
>>> I would also like to try the gradient boosted methods in GBM, but I need parallelization.
>>>
>>> I normally rely on gbm.fit for speed reasons, and I usually call it this way
>>>
>>> gbm_model <- gbm.fit(trainRF, prices_train,
>>>                      offset = NULL,
>>>                      misc = NULL,
>>>                      distribution = "multinomial",
>>>                      w = NULL,
>>>                      var.monotone = NULL,
>>>                      n.trees = 50,
>>>                      interaction.depth = 5,
>>>                      n.minobsinnode = 10,
>>>                      shrinkage = 0.001,
>>>                      bag.fraction = 0.5,
>>>                      nTrain = (n_train/2),
>>>                      keep.data = FALSE,
>>>                      verbose = TRUE,
>>>                      var.names = NULL,
>>>                      response.name = NULL)
>>>
>>> Does anybody know an easy way to parallelize the model (in this case it means simply having 4 cores on the same machine working on the problem)?
>>>
>>> Any suggestion is welcome.
>>>
>>> Cheers
>>>
>>>
>>>
>>> Lorenzo
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Max