[Rd] R 2.12.1 Windows 32bit and 64bit - are numerical differences expected?
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Feb 10 13:39:30 CET 2011
On 11-02-10 6:37 AM, Graham Williams wrote:
> Should one expect minor numerical differences between 64bit and 32bit R on
> Windows? Hunting around the lists I've not been able to find a definitive
> answer yet. It seems plausible given different precision arithmetic, but I
> wanted to confirm with those who might know for sure.
I think our goal is that those results should be as close as possible.
R uses the same precision in both 32 bit and 64 bit; the differences are
all in pointers, not floating point values.
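One way to check this on a given installation is sketched below. It only reads fields of the standard `.Machine` and `R.version` objects; the variable names `fp` and `ptr_bytes` are chosen here for illustration. Run on both builds, the floating-point fields would be expected to match while the pointer size differs.

```r
# Floating-point characteristics: expected to be the same on 32-bit and 64-bit builds.
fp <- .Machine[c("double.eps", "double.digits", "double.xmax")]
print(fp)

# Pointer size in bytes: 4 on a 32-bit build, 8 on a 64-bit build.
ptr_bytes <- .Machine$sizeof.pointer
cat("pointer size:", ptr_bytes, "bytes\n")
cat("architecture:", R.version$arch, "\n")
```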
However, the two versions use different run-time libraries, and it is
possible that there are precision differences coming from there. I
think we'd be interested in knowing what they are even if they are
beyond our control, so I would appreciate it if you could track down
where the difference arises.
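As a starting point for tracking this down, here is a minimal sketch of how intermediate numeric results from the two builds might be compared. The helpers `compare_numeric` and `hex_repr` are hypothetical names invented for this example, not part of R or of any package mentioned in this thread. The idea is to print exact bit patterns on one machine, paste them to the other, and look for the first value that differs.

```r
# Hypothetical helper: summarise how two numeric vectors differ.
compare_numeric <- function(a, b) {
  stopifnot(length(a) == length(b))
  list(
    identical    = identical(a, b),      # exact, bit-for-bit equality
    max_abs_diff = max(abs(a - b)),      # size of the largest discrepancy
    first_diff   = which(a != b)[1]      # NA if no element differs
  )
}

# Hypothetical helper: exact hexadecimal representation of doubles,
# suitable for copying between machines without rounding.
hex_repr <- function(x) sprintf("%a", x)

# Illustration with values that differ only in the last bits.
x <- c(0.1 + 0.2, 1)
y <- c(0.3, 1)
res <- compare_numeric(x, y)
print(res)
print(hex_repr(x))
print(hex_repr(y))
```

Applying the same comparison to successive intermediate quantities (for instance, values computed just before a split is chosen) would narrow down where the two builds first diverge.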
Duncan Murdoch
>
> BACKGROUND
>
> A colleague was trying to replicate some modelling results (from a soon to
> be published book) using rpart, ada, and randomForest, for example. My 64bit
> Linux and 64bit Windows 7 always agree (so far), but not their 32bit
> Windows. I've distilled it to a few simple lines of code to replicate the
> differences (but had to stay with the weather dataset from rattle since I
> could not replicate them on standard datasets yet).
>
> library(rpart)
> library(rattle)
> set.seed(41)
> model<- rpart(RainTomorrow ~ ., data=weather[-c(1, 2, 23)],
> control=rpart.control(minbucket=0))
> print(model$cptable)
>
> Final row on 32bit: 9 0.01000000 23 0.1515152 1.1060606 0.1158273
> Final row on 64bit: 9 0.01000000 23 0.1515152 1.0909091 0.1152273
>
> Pretty minor, but different. I've not found any seed other than 41 (only
> tried a few) that results in a difference.
>
> library(ada) # using rpart underneath
> set.seed(41)
> model<- ada(RainTomorrow ~ ., data=weather[-c(1, 2, 23)])
> print(model)
>
> On 32bit: Train Error: 0.057
> On 64bit: Train Error: 0.055
>
> Changing the seed to 42, for example, brings them into sync.
>
> library(randomForest)
> set.seed(41)
> model<- randomForest(RainTomorrow ~ ., data=weather[-c(1, 2, 23)],
> importance=TRUE, na.action=na.roughfix)
> print(model)
>
> On 32bit: OOB estimate of error rate: 12.84%
> On 64bit: OOB estimate of error rate: 11.75%
>
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252
> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
> [5] LC_TIME=English_Australia.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] randomForest_4.5-36 pmml_1.2.27 XML_3.2-0.2
> [4] colorspace_1.0-1 RGtk2_2.20.3 ada_2.0-2
> [7] rattle_2.6.2 rpart_3.1-47
>
> loaded via a namespace (and not attached):
> [1] tools_2.12.1
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
> ...
>
>
> Thanks,
> Graham
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel