[R] Difference between 32-bit and 64-bit version
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Jun 4 11:53:25 CEST 2015
On 04/06/2015 3:59 AM, Thierry Onkelinx wrote:
> Dear Duncan,
>
> I had been thinking about FAQ 7.31. I tried to create a dummy dataset
> with the same structure to replicate the problem without the need of
> sending my dataset. However, all of them gave identical() results between
> 32-bit and 64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it
> correct to infer that tiny differences between 32-bit and 64-bit are
> possible but have a low probability of occurring?
Differences are rare, but it's hard to assign a probability to them.
Duncan Murdoch
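
A minimal illustration of the FAQ 7.31 point, using made-up values rather
than the coefficients from test.rda: identical() does a bitwise comparison,
whereas all.equal() compares numeric values up to a relative tolerance
(about 1.5e-8 by default).

```r
# Two values that are equal mathematically but differ in their last bits
# (illustrative values only, not taken from the thread's data).
x <- 0.1 + 0.2
y <- 0.3

exact    <- identical(x, y)          # bitwise comparison
tolerant <- isTRUE(all.equal(x, y))  # comparison within a relative tolerance
gap      <- abs(x - y)               # size of the discrepancy

print(c(exact = exact, tolerant = tolerant))
print(gap)
```

Assuming coef.32 and coef.64 have been loaded as in the quoted code,
isTRUE(all.equal(coef.32$fRow, coef.64$fRow)) would be the analogous
tolerant check; it has not been run against that data here.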
>
> signif() does indeed make more sense than round(). Using 20 digits gives
> identical results, 21 digits gives non-identical results.
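
A rough sketch of why rounding with signif() before comparing (or hashing)
helps, and where it can still report a difference. The vectors a and b and
the choice of 8 significant digits are assumptions made for illustration,
not values from the thread.

```r
# Hypothetical coefficients as computed on one platform.
a <- c(1.234567890123456, 2.5e-3, 0)

# The "same" coefficients from another platform: a tiny relative error on
# the non-zero values and a little absolute noise where the true value is 0.
b <- a * (1 + 1e-15) + c(0, 0, 1e-17)

raw_same <- identical(a, b)   # bitwise comparison of the raw values

# Round to a modest number of significant digits, then compare elementwise.
digits <- 8                   # arbitrary choice for this sketch
same <- signif(a, digits) == signif(b, digits)

print(raw_same)
print(same)
```

The non-zero entries agree after rounding, but the third one does not:
signif() keeps significant digits, so 0 and 1e-17 remain different. That is
the kind of spurious mismatch to expect when a true value of 0 picks up
noise, and a hash of the rounded values would differ in that case too.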
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
> and Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to
> say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data. ~ John Tukey
>
> 2015-06-03 18:09 GMT+02:00 Duncan Murdoch <murdoch.duncan at gmail.com>:
>
> On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
> > Dear all,
> >
> > I'm a bit puzzled by the difference in an object when created in R 32-bit
> > and R 64-bit.
> >
> > Consider the code below. test.rda is available at
> > https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
> >
> > # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
> > library(lme4)
> > load("test.rda")
> > coef.32 <- coef(test)
> > save(coef.32, file = "32bit.rda")
> >
> > # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
> > library(lme4)
> > load("~/test.rda")
> > coef.64 <- coef(test)
> > save(coef.64, file = "64bit.rda")
> >
> >
> > # Compare the results
> > # Run in R 3.2.0 Windows 32-bit or 64-bit, lme4 1.1-8
> > library(lme4)
> > load("32bit.rda")
> > load("64bit.rda")
> > identical(coef.32, coef.64) # FALSE
> > identical(coef.32$fRow, coef.64$fRow) # FALSE
> > identical(coef.32$fLocation, coef.64$fLocation) # TRUE
> > identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
> >
> > The first comparison is FALSE, because the second is FALSE. But why is the
> > second FALSE and the third and fourth TRUE?
> >
> > My goal is to calculate a SHA1 hash on coef(test) to track whether the
> > coefficients of test have changed. I'd like to get the same hash on a
> > 32-bit and a 64-bit system. A simple hack would be to calculate the hash on
> > round(coef(test), 20). Is that a good or bad idea?
> >
> > identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE
>
> Different math libraries round differently, so small differences are
> expected. This is FAQ 7.31. In many cases the 32-bit calculations are
> more accurate, because they tend to use more 80-bit extended-precision
> intermediate values, but that is not guaranteed.
>
> Rounding before comparing makes sense, but I would use signif() instead
> of round(), I would choose a relatively small number of significant
> digits, and I would expect to see a few false positives: if the true
> value is 0 but some "random" noise is added, I'd expect values rounded
> by signif() to be unequal.
>
> Duncan Murdoch
>
> >
> > Best regards,
> >
> > ir. Thierry Onkelinx
> > Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> > Forest
> > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> > Kliniekstraat 25
> > 1070 Anderlecht
> > Belgium