[R-pkg-devel] mvrnorm, eigen, tests, and R CMD check
Facundo Muñoz
f@muvie @ending from gm@il@com
Fri May 18 10:50:18 CEST 2018
In my opinion, the underlying problem is that you are checking whether
the test reproduces exactly your pre-computed solution, while there
actually exist other valid answers.
I believe you want to check whether the sub-spaces are the same, not
whether the bases are identical (which can depend on platform, linear
algebra library, etc.)
ƒacu.-
On 05/17/2018 09:30 PM, Kevin Coombes wrote:
> Yes; I'm pretty sure that it is exactly the repeated eigenvalues that are
> the issue. The matrices I am using are all nonsingular, and the various
> algorithms have no problem computing the eigenvalues correctly (up to
> numerical errors that I can bound and thus account for on tests by rounding
> appropriately). But an eigenvalue of multiplicity M has an M-dimensional
> eigenspace with no preferred basis. So, any M-dimensional (unitary) change
> of basis is permitted. That's what give rise to the lack of reproducibility
> across architectures. The choice of basis appears to use different
> heuristics on 32-bit windows than on 64-bit Windows or Linux machines. As a
> result, I can't include the tests I'd like as part of a CRAN submission.
>
> On Thu, May 17, 2018, 2:29 PM William Dunlap <wdunlap at tibco.com> wrote:
>
>> Your explanation needs to be a bit more general in the case of identical
>> eigenvalues - each distinct eigenvalue has an associated subspace, whose
>> dimension is the number repeats of that eigenvalue and the eigenvectors for
>> that eigenvalue are an orthonormal basis for that subspace. (With no
>> repeated eigenvalues this gives your 'unique up to sign'.)
>>
>> E.g., for the following 5x5 matrix with two eigenvalues of 1 and two of 0
>>
>> > x <- tcrossprod( cbind(c(1,0,0,0,1),c(0,1,0,0,1),c(0,0,1,0,1)) )
>> > x
>> [,1] [,2] [,3] [,4] [,5]
>> [1,] 1 0 0 0 1
>> [2,] 0 1 0 0 1
>> [3,] 0 0 1 0 1
>> [4,] 0 0 0 0 0
>> [5,] 1 1 1 0 3
>> the following give valid but different (by more than sign) eigen vectors
>>
>> e1 <- structure(list(values = c(4, 1, 0.999999999999999, 0,
>> -2.22044607159862e-16
>> ), vectors = structure(c(-0.288675134594813, -0.288675134594813,
>> -0.288675134594813, 0, -0.866025403784439, 0, 0.707106781186547,
>> -0.707106781186547, 0, 0, 0.816496580927726, -0.408248290463863,
>> -0.408248290463863, 0, -6.10622663543836e-16, 0, 0, 0, -1, 0,
>> -0.5, -0.5, -0.5, 0, 0.5), .Dim = c(5L, 5L))), .Names = c("values",
>> "vectors"), class = "eigen")
>> e2 <- structure(list(values = c(4, 1, 1, 0, -2.29037708937563e-16),
>> vectors = structure(c(0.288675134594813, 0.288675134594813,
>> 0.288675134594813, 0, 0.866025403784438, -0.784437556312061,
>> 0.588415847923579, 0.196021708388481, 0, 4.46410900710223e-17,
>> 0.22654886208902, 0.566068420404321, -0.79261728249334, 0,
>> -1.11244069540181e-16, 0, 0, 0, -1, 0, -0.5, -0.5, -0.5,
>> 0, 0.5), .Dim = c(5L, 5L))), .Names = c("values", "vectors"
>> ), class = "eigen")
>>
>> I.e.,
>>> all.equal(crossprod(e1$vectors), diag(5), tol=0)
>> [1] "Mean relative difference: 1.407255e-15"
>>> all.equal(crossprod(e2$vectors), diag(5), tol=0)
>> [1] "Mean relative difference: 3.856478e-15"
>>> all.equal(e1$vectors %*% diag(e1$values) %*% t(e1$vectors), x, tol=0)
>> [1] "Mean relative difference: 1.110223e-15"
>>> all.equal(e2$vectors %*% diag(e2$values) %*% t(e2$vectors), x, tol=0)
>> [1] "Mean relative difference: 9.069735e-16"
>>
>>> e1$vectors
>> [,1] [,2] [,3] [,4] [,5]
>> [1,] -0.2886751 0.0000000 8.164966e-01 0 -0.5
>> [2,] -0.2886751 0.7071068 -4.082483e-01 0 -0.5
>> [3,] -0.2886751 -0.7071068 -4.082483e-01 0 -0.5
>> [4,] 0.0000000 0.0000000 0.000000e+00 -1 0.0
>> [5,] -0.8660254 0.0000000 -6.106227e-16 0 0.5
>>> e2$vectors
>> [,1] [,2] [,3] [,4] [,5]
>> [1,] 0.2886751 -7.844376e-01 2.265489e-01 0 -0.5
>> [2,] 0.2886751 5.884158e-01 5.660684e-01 0 -0.5
>> [3,] 0.2886751 1.960217e-01 -7.926173e-01 0 -0.5
>> [4,] 0.0000000 0.000000e+00 0.000000e+00 -1 0.0
>> [5,] 0.8660254 4.464109e-17 -1.112441e-16 0 0.5
>>
>>
>>
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, May 17, 2018 at 10:14 AM, Martin Maechler <
>> maechler at stat.math.ethz.ch> wrote:
>>
>>>>>>>> Duncan Murdoch ....
>>>>>>>> on Thu, 17 May 2018 12:13:01 -0400 writes:
>>> > On 17/05/2018 11:53 AM, Martin Maechler wrote:
>>> >>>>>>> Kevin Coombes ... on Thu, 17
>>> >>>>>>> May 2018 11:21:23 -0400 writes:
>>>
>>> >> [..................]
>>>
>>> >> > [3] Should the documentation (man page) for "eigen" or
>>> >> > "mvrnorm" include a warning that the results can change
>>> >> > from machine to machine (or between things like 32-bit and
>>> >> > 64-bit R on the same machine) because of difference in
>>> >> > linear algebra modules? (Possibly including the statement
>>> >> > that "set.seed" won't save you.)
>>>
>>> >> The problem is that most (young?) people do not read help
>>> >> pages anymore.
>>> >>
>>> >> help(eigen) has contained the following text for years,
>>> >> and in spite of your good analysis of the problem you
>>> >> seem to not have noticed the last semi-paragraph:
>>> >>
>>> >>> Value:
>>> >>>
>>> >>> The spectral decomposition of ‘x’ is returned as a list
>>> >>> with components
>>> >>>
>>> >>> values: a vector containing the p eigenvalues of ‘x’,
>>> >>> sorted in _decreasing_ order, according to ‘Mod(values)’
>>> >>> in the asymmetric case when they might be complex (even
>>> >>> for real matrices). For real asymmetric matrices the
>>> >>> vector will be complex only if complex conjugate pairs
>>> >>> of eigenvalues are detected.
>>> >>>
>>> >>> vectors: either a p * p matrix whose columns contain the
>>> >>> eigenvectors of ‘x’, or ‘NULL’ if ‘only.values’ is
>>> >>> ‘TRUE’. The vectors are normalized to unit length.
>>> >>>
>>> >>> Recall that the eigenvectors are only defined up to a
>>> >>> constant: even when the length is specified they are
>>> >>> still only defined up to a scalar of modulus one (the
>>> >>> sign for real matrices).
>>> >>
>>> >> It's not a warning but a "recall that" .. maybe because
>>> >> the author already assumed that only thorough users would
>>> >> read that and for them it would be a recall of something
>>> >> they'd have learned *and* not entirely forgotten since
>>> >> ;-)
>>> >>
>>>
>>> > I don't think you're really being fair here: the text in
>>> > ?eigen doesn't make clear that eigenvector values are not
>>> > reproducible even within the same version of R, and
>>> > there's nothing in ?mvrnorm to suggest it doesn't give
>>> > reproducible results.
>>>
>>> Ok, I'm sorry ... I definitely did not want to be unfair.
>>>
>>> I've always thought the remark in eigen was sufficient, but I'm
>>> probably wrong and we should add text explaining that it
>>> practically means that eigenvectors are only defined up to sign
>>> switches (in the real case) and hence results depend on the
>>> underlying {Lapack + BLAS} libraries and therefore are platform
>>> dependent.
>>>
>>> Even further, we could consider (optionally, by default FALSE)
>>> using defining a deterministic scheme for postprocessing the current
>>> output of eigen such that at least for the good cases where all
>>> eigenspaces are 1-dimensional, the postprocessing would result
>>> in reproducible signs, by e.g., ensuring the first non-zero
>>> entry of each eigenvector to be positive.
>>>
>>> MASS::mvrnorm() and mvtnorm::rmvnorm() both use "eigen",
>>> whereas mvtnorm::rmvnorm() *does* have method = "chol" which
>>> AFAIK does not suffer from such problems.
>>>
>>> OTOH, the help page of MASS::mvrnorm() mentions the Cholesky
>>> alternative but prefers eigen for better stability (without
>>> saying more).
>>>
>>> In spite of that, my personal recommendation would be to use
>>>
>>> mvtnorm::rmvnorm(.., method = "chol")
>>>
>>> { or the 2-3 lines of R code to the same thing without an extra package,
>>> just using rnorm(), chol() and simple matrix operations }
>>>
>>> because in simulations I'd expect the var-cov matrix Sigma to
>>> be far enough away from singular for chol() to be stable.
>>>
>>> Martin
>>>
>>> ______________________________________________
>>> R-package-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>>
>>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-package-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
[[alternative HTML version deleted]]
More information about the R-package-devel
mailing list