[R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks
Ivan Krylov
|kry|ov @end|ng |rom d|@root@org
Tue Nov 12 11:56:34 CET 2024
В Tue, 12 Nov 2024 04:02:17 +0000
Mauricio Vargas Sepulveda <m.sepulveda using mail.utoronto.ca> пишет:
> For v2.0.3 I implemented a full refactor, where I simplified the code
> considerably while trying to solve what looks to be a false positive.
>
> Now I implemented a "patch" based on the suggestion:
I see the change on GitHub [1], made back on October 22nd, but I am not
seeing these checks in your latest submission to CRAN [2] from November
7th. Before you have implemented [1], have you been able to reproduce
the errors that your package received on CRAN, that is, the
container-overflow and the null object pointers?
I've just noticed that FuzzyVariableParser::ParseAllVariables is using
std::thread::hardware_concurrency(). CRAN uses a shared computer with
many cores to test many packages at the same time, so the code most
likely needs to limit itself to two threads during tests and examples
[3].
> SUMMARY: AddressSanitizer: alloc-dealloc-mismatch
> (/opt/R/devel-asan/lib/R/bin/exec/R+0xc6306) (BuildId:
> 178e357df79b1589a38c1949da5e5f022d4bb535) in free
This is an error you're getting when testing with an R-hub image. R-hub
is not CRAN. You've performed good investigative work to reduce the
issue you're seeing on R-hub for the StackOverflow question [4], but it
looks like this particular problem is already reported to R-hub
developers [5] and it's not what got your latest CRAN submission
archived.
When your latest submission, redatam_2.0.3.tar.gz, was archived, you
should have received an e-mail from CRAN. Was it about the package
failing automatic checks? Were there any other comments? Did the e-mail
highlight any issues in particular?
I do see the problem in [6], but it's hard to diagnose by itself. I was
able to reproduce it (but only after compiling everything with
-shared-san and providing the necessary LD_LIBRARY_PATH, otherwise
packages using C++ failed to load; I also had to manually set
UBSAN_OPTIONS=print_stacktrace=1 to find out where the overflow
originates):
> read_redatam(paste(dout, "cg15.dic", sep = "/"))
Opening dictionary file...
/usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1272:9: runtime error: addition of unsigned offset to 0x7f38f12bcc30 overflowed to 0x7f38f12bcc2f
#0 0x7f38e859a15f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::operator[](unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1272:9
#1 0x7f38e859a15f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::back() /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1353:9
#2 0x7f38e859a15f in RedatamLib::ListExporter::ListExporter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /root/redatam.Rcheck/00_pkg_src/redatam/src/redatamlib/exporters/RListExporter.cpp:16:21
#3 0x7f38e85a7d60 in RedatamLib::RedatamDatabase::ExportRLists() const /root/redatam.Rcheck/00_pkg_src/redatam/src/redatamlib/entities/RedatamDatabase.cpp:23:16
#4 0x7f38e85b2224 in export_redatam_to_list_(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>) /root/redatam.Rcheck/00_pkg_src/redatam/src/main.cpp:13:15
#5 0x7f38e85b3068 in _redatam_export_redatam_to_list_ /root/redatam.Rcheck/00_pkg_src/redatam/src/cpp11.cpp:12:27
#6 0x55b5da67bf6a in R_doDotCall /root/R/src/main/../../../R-svn/src/main/dotcode.c:754:11
The corresponding lines of the source code are:
ListExporter::ListExporter(const std::string &outputDirectory)
: m_path(outputDirectory) {
if ('/' != m_path.back()) { // <-- here
m_path.append("/");
}
}
Looks like the code is calling .back() on an empty string. This also
needs to be prevented somehow.
To summarise, I recommend to do the following for the next submission:
- make sure the fix for the UBSan issues from [7] is included and note
it in the submission comments
- avoid using more than two threads during R CMD check
- fix the empty string issue from [6] and note it in the submission
comments
- fix any other issues noted in the CRAN correspondence and note it in
the submission comments
- do not mention the alloc-free-mismatch problem in the submission
comments
Good luck! I hope this helps and there won't be any more issues hiding
in the code.
--
Best regards,
Ivan
[1]
https://github.com/pachadotdev/open-redatam/commit/5399332f26fde8029cc7f78e0f889dfd983504c2
[2]
https://cran.r-project.org/incoming/archive/redatam_2.0.3.tar.gz
[3]
https://contributor.r-project.org/cran-cookbook/code_issues.html#using-more-than-2-cores
[4]
https://stackoverflow.com/q/79171799
[5]
https://github.com/r-hub/rhub/issues/598
There's no actual alloc-free-mismatch, it's a false positive due to the
way libc++ (correctly) uses the delete operator:
https://github.com/llvm/llvm-project/issues/59432
[6]
https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241107_154750/specialChecks/clang-san/summary.txt
[7]
https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-ASAN/redatam/00check.log
https://www.stats.ox.ac.uk/pub/bdr/memtests/gcc-UBSAN/redatam/00check.log
More information about the R-package-devel
mailing list