[R-pkg-devel] mvrnorm, eigen, tests, and R CMD check
Ben Bolker
bbolker at gmail.com
Thu May 17 18:07:23 CEST 2018
As far as questions #1 and #2 go: you can probably use the components
of R.Version() (e.g. $arch, maybe some substring of $os) to compare
your test output only for "sufficiently similar" platforms. Depending
on how obsessive you are you could generate different test output for
a bunch of platforms and include them all. Might be cleaner to store
the results in a separate file. For example, say that
inst/testdata/eigentests.RData contains a list of results such as

    eigen_results <- list(
        x86_64_darwin = ...,
        x86_64_linux  = ...,
        x86_64_mingw  = ...,   ## Windows: R.Version()$os is "mingw32"
        i386_mingw    = ...,   ## 32-bit Windows: $arch is "i386"
        etc.)

then something like

    load(system.file("testdata", "eigentests.RData", package = "mypkg"))
    rv <- R.Version()
    ## sub(pattern, replacement, x): keep only the leading letters of $os
    platformname <- paste(rv$arch, sub("^([[:alpha:]]+).*", "\\1", rv$os), sep = "_")
    expected <- eigen_results[[platformname]]

should extract the correct version for the platform currently being tested.
The list names have to match whatever the paste() call produces on each
platform.
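
A slightly fuller sketch of the same idea, covering the case where the
reference file has no entry for the platform being checked. The helper
name platform_key(), the temporary file standing in for
inst/testdata/eigentests.RData, and the stored values are all made up
for illustration; the only point is that a missing entry leads to a
skipped comparison rather than a failed check.

```r
## Hypothetical helper: build a key such as "x86_64_linux" from R.Version()
platform_key <- function(rv = R.Version()) {
    os <- sub("^([[:alpha:]]+).*", "\\1", rv$os)   # "linux-gnu" -> "linux"
    paste(rv$arch, os, sep = "_")
}

## Stand-in for inst/testdata/eigentests.RData (illustrative contents only)
ref_file <- tempfile(fileext = ".RData")
eigen_results <- list(x86_64_linux = c(2.5, 0.5, 0.5, 0.5))
save(eigen_results, file = ref_file)
rm(eigen_results)

load(ref_file)                      # restores 'eigen_results'
key <- platform_key()
expected <- eigen_results[[key]]    # NULL if this platform has no entry

if (is.null(expected)) {
    message("no reference results for platform '", key, "'; skipping comparison")
} else {
    Sigma <- matrix(0.5, 4, 4)
    diag(Sigma) <- 1
    stopifnot(isTRUE(all.equal(eigen(Sigma, symmetric = TRUE)$values, expected)))
}
```

In a real package the load() call would use system.file() as above, and
the skip branch keeps R CMD check quiet on platforms nobody has
generated reference output for.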
On Thu, May 17, 2018 at 11:53 AM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
>>>>>> Kevin Coombes <kevin.r.coombes at gmail.com>
>>>>>> on Thu, 17 May 2018 11:21:23 -0400 writes:
>
> > Hi, I wrote and maintain the Thresher package. It includes
> > code to do simulations. In the "tests" directory of the
> > package, I do some simple simulations and run the main
> > algorithm, then write out summaries of the results.
>
> > The initial submission of the package to CRAN was delayed
> > because the "Rout.save" files matched the "Rout" files on
> > 64-bit R but *not* on 32-bit R on Windows. After
> > investigating, I realized that when my simulation code
> > called "MASS::mvrnorm", I got different results from
> > 64-bit and 32-bit versions of R on the same machine.
> > Pushing further, I determined that this was happening
> > because mvrnorm used "eigen" to compute the eigenvalues
> > and eigenvectors, and "eigen" itself gave different
> > answers in the two R versions.
>
> > The underlying issue (mathematically) is that the
> > correlation/covariance matrix I was using had repeated
> > eigenvalues, and so there is no unique choice of basis for
> > the associated eigenspace. This observation suggests that
> > the issue is potentially more general than 32-bit versus
> > 64-bit; the results will depend on the implementation of
> > the eigen-decomposition in whatever linear algebra module
> > is compiled along with R, so it can change from machine to
> > machine.
>
> > I "solved" (well, worked around) the immediate problem
> > with package submission by changing the test code to not
> > write out anything that might differ between versions.
>
> > With all of that as background, here are my main
> > questions:
>
> > [1] Is there any way to put something into the "tests"
> > directory that would allow me to use these simulations for
> > what computer scientists call regression testing? (That
> > is, to make sure my changes to the code haven't changed
> > results in an unexpected way.)
>
> > [2] Should there be a flag or instruction to R CMD check
> > that says to only run or interpret this particular test on
> > a specific version or machine? (Or is there already such a
> > flag that I don't know about?)
>
> > [3] Should the documentation (man page) for "eigen" or
> > "mvrnorm" include a warning that the results can change
> > from machine to machine (or between things like 32-bit and
> > 64-bit R on the same machine) because of differences in
> > linear algebra modules? (Possibly including the statement
> > that "set.seed" won't save you.)
>
> The problem is that most (young?) people do not read help pages
> anymore.
>
> help(eigen) has contained the following text for years, and in
> spite of your good analysis of the problem you seem to not have
> noticed the last semi-paragraph:
>
>> Value:
>>
>> The spectral decomposition of ‘x’ is returned as a list with
>> components
>>
>> values: a vector containing the p eigenvalues of ‘x’, sorted in
>> _decreasing_ order, according to ‘Mod(values)’ in the
>> asymmetric case when they might be complex (even for real
>> matrices). For real asymmetric matrices the vector will be
>> complex only if complex conjugate pairs of eigenvalues are
>> detected.
>>
>> vectors: either a p * p matrix whose columns contain the eigenvectors
>> of ‘x’, or ‘NULL’ if ‘only.values’ is ‘TRUE’. The vectors
>> are normalized to unit length.
>>
>> Recall that the eigenvectors are only defined up to a
>> constant: even when the length is specified they are still
>> only defined up to a scalar of modulus one (the sign for real
>> matrices).
>
> It's not a warning but a "recall that" .. maybe because the
> author already assumed that only thorough users would read that
> and for them it would be a recall of something they'd have
> learned *and* not entirely forgotten since ;-)
>
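
To make the non-uniqueness concrete, here is a minimal sketch (not taken
from Thresher or MASS; the matrix and the rotation angle are arbitrary
choices for illustration). It builds a correlation matrix with a
repeated eigenvalue, constructs a second eigenbasis that differs by a
sign flip and a rotation inside the repeated eigenspace, and shows that
the raw vectors disagree while the eigenvalues and the reconstructed
matrix do not. That suggests comparing invariants with all.equal() in
tests rather than printing eigenvectors.

```r
## Compound-symmetric correlation matrix: eigenvalue 1 - rho occurs p - 1 times
p   <- 4
rho <- 0.5
Sigma <- matrix(rho, p, p)
diag(Sigma) <- 1

e <- eigen(Sigma, symmetric = TRUE)
V <- e$vectors

## A second, equally valid eigenbasis: rotate columns 2 and 3 (both belong
## to the repeated eigenvalue) and flip the sign of column 1
theta <- pi / 6
G <- diag(p)
G[2:3, 2:3] <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
V2 <- V %*% G
V2[, 1] <- -V2[, 1]

## The raw eigenvectors differ ...
vectors_equal <- isTRUE(all.equal(V, V2))

## ... but the reconstruction V diag(lambda) V' is the same for both bases
recon  <- V  %*% diag(e$values) %*% t(V)
recon2 <- V2 %*% diag(e$values) %*% t(V2)
invariants_equal <- isTRUE(all.equal(recon, recon2)) &&
                    isTRUE(all.equal(recon, Sigma))
```

Anything downstream that uses the eigenvectors directly, such as the
draws from mvrnorm, can therefore differ between platforms even with the
same seed.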
> Martin Maechler
> ETH Zurich