[R-pkg-devel] Advice for addressing CRAN rejection
smallepsilon
@m@||ep@||on @end|ng |rom proton@me
Wed May 14 17:18:27 CEST 2025
Ben,
No need to apologize. I hope the following example helps clarify what I mean. Suppose that modify_matrix(mat, other_args) is a function that, among other things, applies eigen() to mat. For good reasons, other_args has no default value. It is sometimes convenient, though, to supply the user with default values. Therefore, there is another function, convenient_modify_matrix():
convenient_modify_matrix(mat) <- function(mat) modify_matrix(mat, other_args = default).
To help verify that the code is correct, I want to check the result from identical() below:
set.seed(20250514)
A <- modify_matrix(mat, other_args = default)
set.seed(20250514)
B <- convenient_modify_matrix(mat)
identical(A, B)
In an ideal world, the result would be TRUE. The results can differ, though, because of multithreading (synonymous with parallel computation, yes?) used by the code underlying eigen(). As I understand it, this occurs when calling LAPACK/BLAS routines on some systems (e.g., MKL). Can that multithreading be turned off? Searching online shows that there is a lot of interest in turning it on; not so much in turning it off.
Thanks,
Jesse
On Tuesday, May 13th, 2025 at 8:07 PM, Ben Bolker <bbolker using gmail.com> wrote:
>
>
> Hmm. The thread you linked to is specifically an issue with
> non-deterministic linear algebra, the solution to which is to disable
> threaded computations. I don't think CRAN multithreads by default (and I
> don't know if they test on MKL at all ...?)
>
> Can you provide more specific/concrete examples of the tests? (Again,
> I apologize if there were examples posted up-thread -- I'm too lazy to
> search for them.) I'm not quite sure I understand your comment about
>
> > Suppose, for example, that X is a symmetric, positive definite
>
> matrix. Then identical() will usually distinguish between (X^1/2)^-1 and
> (X^-1)^1/2 (the kind of thing I want to be able to check) while
> all.equal() will generally not
>
>
> What is X^1/2? (There are infinitely many ways to take a matrix
> square root ...) Interpreting X^(1/2) as chol(X) and X^(-1) as solve(X),
> these are not even close:
>
> > set.seed(101); m <- crossprod(matrix(rnorm(9), 3, 3))
>
> > all.equal(solve(chol(m)), chol(solve(m)))
>
> [1] "Mean relative difference: 0.6655765"
>
>
> In general "convenience shortcuts" that do any kind of rearranging of
> a floating point computation cannot be guaranteed to be identical;
> this is a corollary of
>
> https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>
> See also
> https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal/9508558#9508558
>
> (e.g., floating-point addition is not associative)
>
> I apologize if this sounds basic/is telling you something you already
> know, but from what I can understand of your questions so far, you are
> asking for something that is not possible in general.
>
> Can you clarify further please?
>
> cheers
> Ben Bolker
>
>
>
> On 5/13/25 15:08, smallepsilon wrote:
>
> > Ben,
> >
> > The thread to which I alluded is here: https://stat.ethz.ch/pipermail/r-help/2025-May/480866.html
> >
> > Further clarification: The package provides some convenience shortcuts for the user which should run the same calculations as their longer counterparts. I want to use identical() to provide strong evidence that this is happening. Suppose, for example, that X is a symmetric, positive definite matrix. Then identical() will usually distinguish between (X^1/2)^-1 and (X^-1)^1/2 (the kind of thing I want to be able to check) while all.equal() will generally not (unless I set the tolerance sufficiently low, but that is just making all.equal() behave more like identical()). Using all.equal() helps detect catastrophic errors, but those would be detected in other tests already.
> >
> > Thanks,
> >
> > Jesse
> >
> > On Tuesday, May 13th, 2025 at 1:41 PM, Ben Bolker bbolker using gmail.com wrote:
> >
> > > Can you please clarify (maybe by linking back to an earlier thread, don't remember if you discussed this previously) what you mean by "I realized that because all.equal() does not test (even as a proxy) that the same calculations were done"?
> > >
> > > On Tue, May 13, 2025, 1:05 PM smallepsilon smallepsilon using proton.me wrote:
> > >
> > > > I have been trying to fix some issues with my package's testing on CRAN, which culminated in a rejection email from a CRAN administrator that I am not sure how to address.
> > > >
> > > > The first issues arose with MKL. (I got helpful information about that recently on r-help.) In many package tests, I want to verify that two ways of specifying something lead to the execution of exactly the same calculations. I use identical() as a proxy for this, but because numeric results are not necessarily reproducible when using parallel processing, this does not work on all platforms.
> > > >
> > > > My initial attempts to address this involved replacing the offending identical() calls with all.equal() calls. After two or three such attempts, I realized that because all.equal() does not test (even as a proxy) that the same calculations were done, it is impractical and unnecessary to run these tests on all of the CRAN platforms. I moved the original test files to a separate folder on my computer so I can run them all locally. (My assumption is that if the logic is correct on my computer, then it is correct on all of them, and identical() helps verify this.) In the newest package version uploaded to CRAN, I included the tests that verify the essential functionality of the package so that the crucial output values are the same on all platforms, up to a reasonable number of significant digits. These are the tests that are clearly important to run on all platforms.
> > > >
> > > > My submission was rejected, not because of test failures, but because I had "removed the failing tests which is not the idea of tests." No errors/warnings/notes were reported to me. The only option I have been given is to replace identical() with all.equal(), which defeats the purpose of these particular tests.
> > > >
> > > > I replied to the administrator's email with a brief version of all of this, but have not gotten a response. Any advice on what else I could do would be appreciated.
> > > >
> > > > Thanks,
> > > >
> > > > Jesse
> > > >
> > > > ______________________________________________
> > > > R-package-devel using r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> * E-mail is sent at my convenience; I don't expect replies outside of
> working hours.
More information about the R-package-devel
mailing list