[R] Reproducibility issue in gbm (32 vs 64 bit)
Joshua Wiley
jwiley.psych at gmail.com
Sat Feb 26 07:16:02 CET 2011
Hi Axel,
I do not have a good explanation off the top of my head for why the
results differ. I can confirm that I get the same discrepancy between
32-bit and 64-bit R (both on Windows 7) with the development version of
R and gbm_1.6-3.1.
Here is an even simpler example that shows the difference:
gbmfit <- gbm(1:50 ~ I(50:1) + I(60:11), distribution = "gaussian")
summary(gbmfit)
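One thing I do notice in the numbers you posted: on 32-bit, the
influence of xc[, 1] and xc[, 2] adds up to 27.27432 + 22.96425 =
50.23857, which is exactly what xc[, 1] gets on its own on 64-bit. So
the two builds seem to agree on the total and differ only in how it is
split between the two identical columns. Here is a rough sketch of that
check. It uses only the figures from your post and does not need gbm;
the object names (ri32, ri64, dup) are just mine for illustration.

```r
# Relative influence as reported in the original post (R 2.12.0, Windows 7).
ri32 <- c("xc[, 3]" = 49.76143, "xc[, 1]" = 27.27432, "xc[, 2]" = 22.96425)
ri64 <- c("xc[, 1]" = 50.23857, "xc[, 3]" = 49.76143, "xc[, 2]" = 0.00000)

# xc[, 1] and xc[, 2] are identical columns, so pool their influence.
dup <- c("xc[, 1]", "xc[, 2]")
pooled32 <- sum(ri32[dup])
pooled64 <- sum(ri64[dup])

# Do the two builds agree on the pooled influence (up to rounding)?
isTRUE(all.equal(pooled32, pooled64))
```

If that holds more generally, the difference may come down to which of
two equally good split variables gets picked, but that is a guess on my
part. I have not checked the gbm source.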
I have copied the package maintainer.
Cheers,
Josh
On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz <axel.urbiz at gmail.com> wrote:
> Dear List,
>
> The gbm package on Win 7 produces different results for the
> relative importance of input variables in R 32-bit relative to R 64-bit. Any
> idea why? Any idea which one is correct?
>
> Based on this example, it looks like the relative importance of 2 perfectly
> correlated predictors is "diluted" by half in 32-bit, whereas in 64-bit, one
> of these predictors gets all the importance and the other gets none. I found
> this interesting.
>
> ### Sample code
>
> library(gbm)
> set.seed(12345)
> xc=matrix(rnorm(100*20),100,20)
> y=sample(1:2,100,replace=TRUE)
> xc[,2] <- xc[,1]
> gbmfit <- gbm(y~xc[,1]+xc[,2] +xc[,3], distribution="gaussian")
> summary(gbmfit)
>
> ### Results on R 2.12.0 (32-bit)
>
>       var  rel.inf
> 1 xc[, 3] 49.76143
> 2 xc[, 1] 27.27432
> 3 xc[, 2] 22.96425
>
> ### Results on R 2.12.0 (64-bit)
> > summary(gbmfit)
>       var  rel.inf
> 1 xc[, 1] 50.23857
> 2 xc[, 3] 49.76143
> 3 xc[, 2]  0.00000
>
> Thanks,
> Axel.
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/