[R-pkg-devel] mvrnorm, eigen, tests, and R CMD check

Ben Bolker bbolker at gmail.com
Thu May 17 18:07:23 CEST 2018


As far as questions #1 and #2 go: you can probably use the components
of R.Version() (e.g. $arch, maybe some substring of $os) to compare
your test output only for "sufficiently similar" platforms. Depending
on how obsessive you are you could generate different test output for
a bunch of platforms and include them all.  Might be cleaner to store
the results in a separate file.  For example, say that
inst/testdata/eigentests.RData contains a list of results - then
something like

eigen_results <- list(
   x86_64_darwin= ...,
   x86_64_linux = ...,
   x86_64_windows= ...,
   x86_32_darwin=...,
  etc.)

load(system.file("testdata","eigentests.RData",package="mypkg"))
rv <- R.Version()
platformname <- paste(rv$arch,sub("^([[:alpha:]]+).*","\\1",rv$os),sep="_")
expected <- eigen_results[[platformname]]

should extract the correct version for the platform currently being tested.
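
A slightly more defensive version of the same idea, as a rough sketch only: platform_key() and expected_for_platform() are names I made up for illustration, and the one-element list below is a stand-in for whatever would really be stored in the .RData file. The names in the stored list have to match what R.Version() actually reports; as far as I know Windows builds give os = "mingw32", so the key there would come out as "x86_64_mingw" rather than "x86_64_windows".

```r
# Build a key such as "x86_64_linux" from the running R's platform info.
platform_key <- function(rv = R.Version()) {
  # keep only the leading alphabetic part of the OS string,
  # e.g. "linux-gnu" -> "linux", "darwin17.0" -> "darwin"
  os_family <- sub("^([[:alpha:]]+).*$", "\\1", rv$os)
  paste(rv$arch, os_family, sep = "_")
}

# Look up stored results; NULL means "no reference for this platform".
expected_for_platform <- function(results, key = platform_key()) {
  if (key %in% names(results)) results[[key]] else NULL
}

# Hypothetical stand-in for the list stored in eigentests.RData
eigen_results <- list(x86_64_linux = c(3, 1, 1))

expected <- expected_for_platform(eigen_results)
if (is.null(expected)) {
  message("no stored results for ", platform_key(), "; skipping comparison")
} else {
  # compare freshly computed results against 'expected' here,
  # e.g. with all.equal()
}
```

Skipping (rather than failing) on unknown platforms means the test only does regression checking where reference output exists.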


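A different approach (not something proposed elsewhere in this thread, just a possibility) would be to test only quantities that don't depend on the choice of basis within a repeated eigenspace: the eigenvalues, the reconstructed matrix, and the projector onto the eigenspace. A minimal sketch, using a made-up 3x3 compound-symmetry matrix as the example:

```r
# Symmetric matrix with a repeated eigenvalue:
# eigenvalues are 1 + 2*rho (once) and 1 - rho (twice).
rho <- 0.5
S <- matrix(rho, 3, 3)
diag(S) <- 1
e <- eigen(S, symmetric = TRUE)

# Invariant 1: the eigenvalues themselves.
stopifnot(isTRUE(all.equal(e$values, c(2, 0.5, 0.5))))

# Invariant 2: V diag(lambda) V' reproduces S whatever basis was chosen.
recon <- e$vectors %*% diag(e$values) %*% t(e$vectors)
stopifnot(isTRUE(all.equal(recon, S)))

# Invariant 3: the projector onto the repeated eigenspace. Here that space
# is the orthogonal complement of (1,1,1), so the projector is I - J/3.
V2 <- e$vectors[, 2:3]
P <- V2 %*% t(V2)
expected_P <- diag(3) - matrix(1/3, 3, 3)
stopifnot(isTRUE(all.equal(P, expected_P)))
```

This doesn't make the individual mvrnorm draws reproducible across platforms; it only gives checks on the decomposition that should hold regardless of which eigenvectors the linear algebra library happens to return.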

On Thu, May 17, 2018 at 11:53 AM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
>>>>>> Kevin Coombes <kevin.r.coombes at gmail.com>
>>>>>>     on Thu, 17 May 2018 11:21:23 -0400 writes:
>
>     > Hi, I wrote and maintain the Thresher package. It includes
>     > code to do simulations. In the "tests" directory of the
>     > package, I do some simple simulations and run the main
>     > algorithm, then write out summaries of the results.
>
>     > The initial submission of the package to CRAN was delayed
>     > because the "Rout.save" files matched the "Rout" files on
>     > 64-bit R but *not* on 32-bit R on Windows. After
>     > investigating, I realized that when my simulation code
>     > called "MASS::mvrnorm", I got different results from
>     > 64-bit and 32-bit versions of R on the same machine.
>     > Pushing further, I determined that this was happening
>     > because mvrnorm used "eigen" to compute the eigenvalues
>     > and eigenvectors, and "eigen" itself gave different
>     > answers in the two R versions.
>
>     > The underlying issue (mathematically) is that the
>     > correlation/covariance matrix I was using had repeated
>     > eigenvalues, and so there is no unique choice of basis for
>     > the associated eigenspace. This observation suggests that
>     > the issue is potentially more general than 32-bit versus
>     > 64-bit; the results will depend on the implementation of
>     > the eigen-decomposition in whatever linear algebra module
>     > is compiled along with R, so it can change from machine to
>     > machine.
>
>     > I "solved" (well, worked around) the immediate problem
>     > with package submission by changing the test code to not
>     > write out anything that might differ between versions.
>
>     > With all of that as background, here are my main
>     > questions:
>
>     > [1] Is there any way to put something into the "tests"
>     > directory that would allow me to use these simulations for
>     > what computer scientists call regression testing? (That
>     > is, to make sure my changes to the code haven't changed
>     > results in an unexpected way.)
>
>     > [2] Should there be a flag or instruction to R CMD check
>     > that says to only run or interpret this particular test on
>     > a specific version or machine? (Or is there already such a
>     > flag that I don't know about?)
>
>     > [3] Should the documentation (man page) for "eigen" or
>     > "mvrnorm" include a warning that the results can change
>     > from machine to machine (or between things like 32-bit and
>     > 64-bit R on the same machine) because of differences in
>     > linear algebra modules? (Possibly including the statement
>     > that "set.seed" won't save you.)
>
> The problem is that most (young?) people do not read help pages
> anymore.
>
> help(eigen) has contained the following text for years, and in
> spite of your good analysis of the problem you seem to not have
> noticed the last semi-paragraph:
>
>> Value:
>>
>>      The spectral decomposition of ‘x’ is returned as a list with
>>      components
>>
>>   values: a vector containing the p eigenvalues of ‘x’, sorted in
>>           _decreasing_ order, according to ‘Mod(values)’ in the
>>           asymmetric case when they might be complex (even for real
>>           matrices).  For real asymmetric matrices the vector will be
>>           complex only if complex conjugate pairs of eigenvalues are
>>           detected.
>>
>>  vectors: either a p * p matrix whose columns contain the eigenvectors
>>           of ‘x’, or ‘NULL’ if ‘only.values’ is ‘TRUE’.  The vectors
>>           are normalized to unit length.
>>
>>           Recall that the eigenvectors are only defined up to a
>>           constant: even when the length is specified they are still
>>           only defined up to a scalar of modulus one (the sign for real
>>           matrices).
>
> It's not a warning but a "recall that" .. maybe because the
> author already assumed that only thorough users would read that
> and for them it would be a recall of something they'd have
> learned *and* not entirely forgotten since ;-)
>
> Martin Maechler
> ETH Zurich
>
> ______________________________________________
> R-package-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel


