[R] Reproducibility issue in gbm (32 vs 64 bit)
Ridgeway, Greg
gregr at rand.org
Sat Feb 26 17:46:26 CET 2011
I have heard of this happening before on other platforms. Frankly, I'm not sure how it happens. My best guess is that there's a tiny bit of numerical instability in the ninth decimal place or beyond, so that on a given iteration one of the two variables looks better than the other essentially at random. Any other ideas?
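To illustrate that guess, here is a minimal sketch in base R. It is not gbm's actual split-search code: split_gain() is a hypothetical helper, and the 1e-12 perturbation is only an assumed stand-in for platform-dependent rounding noise.

```r
# Hypothetical illustration only; gbm's internal split search is not shown here.
# Reduction in squared error from splitting y at a threshold on x.
split_gain <- function(x, y, threshold) {
  sse <- function(v) sum((v - mean(v))^2)
  left  <- y[x <= threshold]
  right <- y[x > threshold]
  sse(y) - (sse(left) + sse(right))
}

set.seed(12345)
x1 <- rnorm(100)
x2 <- x1                                # exact copy, as in the example in this thread
y  <- sample(1:2, 100, replace = TRUE)

gain1 <- split_gain(x1, y, median(x1))
gain2 <- split_gain(x2, y, median(x2))

# The same arithmetic on the same numbers gives a bit-identical tie.
identical(gain1, gain2)

# A perturbation far below the printed precision decides the winner.
eps <- 1e-12                            # assumed stand-in for rounding noise
which.max(c(gain1, gain2 + eps))        # selects the second variable
which.max(c(gain1 + eps, gain2))        # selects the first variable
```

If ties always break the same way, one copy would collect every split. If the noise varies from iteration to iteration, the splits would be shared between the copies. That would be consistent with the two patterns reported, but it is still only a guess.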
Greg
----- Original Message -----
From: Joshua Wiley <jwiley.psych at gmail.com>
To: Axel Urbiz <axel.urbiz at gmail.com>
Cc: R-help at r-project.org <R-help at r-project.org>; Ridgeway, Greg
Sent: Fri Feb 25 22:16:02 2011
Subject: Re: [R] Reproducibility issue in gbm (32 vs 64 bit)
Hi Axel,
I do not have a nice explanation off the top of my head for why the
results differ. I can say that I can replicate what you get on 32-bit
and 64-bit R (both on Windows 7) with the development version of R and
gbm_1.6-3.1.
Here is an even simpler example that shows the difference:
library(gbm)
gbmfit <- gbm(1:50 ~ I(50:1) + I(60:11), distribution = "gaussian")
summary(gbmfit)
I have copied the package maintainer.
Cheers,
Josh
On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz <axel.urbiz at gmail.com> wrote:
> Dear List,
>
> The gbm package on Win 7 produces different results for the
> relative importance of input variables in R 32-bit relative to R 64-bit. Any
> idea why? Any idea which one is correct?
>
> Based on this example, it looks like the relative importance of 2 perfectly
> correlated predictors is "diluted" by half in 32-bit, whereas in 64-bit, one
> of these predictors gets all the importance and the other gets none. I found
> this interesting.
>
> ### Sample code
>
> library(gbm)
> set.seed(12345)
> xc <- matrix(rnorm(100 * 20), 100, 20)
> y <- sample(1:2, 100, replace = TRUE)
> xc[, 2] <- xc[, 1]  # make predictor 2 an exact copy of predictor 1
> gbmfit <- gbm(y ~ xc[, 1] + xc[, 2] + xc[, 3], distribution = "gaussian")
> summary(gbmfit)
>
> ### Results on R 2.12.0 (32-bit)
>
>       var  rel.inf
> 1 xc[, 3] 49.76143
> 2 xc[, 1] 27.27432
> 3 xc[, 2] 22.96425
>
> ### Results on R 2.12.0 (64-bit)
> > summary(gbmfit)
>       var  rel.inf
> 1 xc[, 1] 50.23857
> 2 xc[, 3] 49.76143
> 3 xc[, 2]  0.00000
>
> Thanks,
> Axel.
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/