[R-pkg-devel] Canonical way to Rprintf R_xlen_t

Tomas Kalibera
Tue Nov 28 22:21:00 CET 2023


On 11/28/23 21:50, Henrik Bengtsson wrote:
> Daniel, I get those compiler warnings for '%td' on MS Windows. It works
> fine on Linux.

Please let me clarify. %td works in R on Windows in R 4.3 and R-devel
when using the recommended toolchain, which is Rtools43. It also worked
with R 4.2 and Rtools42. It has worked ever since R switched to UCRT on
Windows. I assume you are not using a recommended toolchain and that is
why you are getting the warning. Please let me know if this is not the
case and I will try to help.

There is a bug in GCC, still present in GCC 10 and GCC 12, due to which
GCC displays warnings about the format even when it is supported. The
details are complicated, but in short, it accidentally applies both the
Microsoft format and the C99/GNU format checks to printf functions with
UCRT, so you get a warning whenever the two formats disagree. That
includes printing a 64-bit integer, and also %td, which is not supported
by the Microsoft format, as well as, say, %zu (size_t) or %Lf (long
double). I've been patching GCC in Rtools42 and Rtools43 to avoid this
problem, so you don't get the warning there. My patch has also been
picked up by Msys2; I didn't check whether it is still there. Finally, a
new implementation of the patch was accepted into GCC trunk, so
eventually the patching will no longer be needed. But regardless of
which version of GCC Rtools44 uses, I will make sure it accepts C99
printf formats without warnings. An unpatched GCC 10 or 12 with UCRT
will print a warning for %td but will still support it.
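
A minimal sketch of the runtime side of this, assuming a C99-conforming
printf implementation (such as UCRT or glibc). The helper name
td_matches_lld is hypothetical; it formats the same value with "%td" and
with "%lld" after a cast and checks that the two results agree. On an
affected, unpatched GCC the "%td" line may draw a -Wformat warning at
compile time even though the call behaves correctly when run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format n twice, once with C99 "%td" and once with "%lld" after a cast,
   and report whether both conversions produced the same text.
   td_matches_lld is a hypothetical helper name, not part of R. */
int td_matches_lld(ptrdiff_t n)
{
    char a[32], b[32];
    int la = snprintf(a, sizeof a, "%td", n);
    int lb = snprintf(b, sizeof b, "%lld", (long long)n);

    /* A signed 64-bit decimal needs at most 20 characters, so 32 is ample. */
    assert(la > 0 && la < (int)sizeof a);
    assert(lb > 0 && lb < (int)sizeof b);

    return strcmp(a, b) == 0;
}
```

With a conforming runtime, td_matches_lld(n) is expected to return 1 for
any n, which is the sense in which the format is "supported" despite the
warning.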

Best
Tomas

> FYI, https://builder.r-hub.io/ is a great, free service for testing on
> various platforms in the cloud.  Also, if you host your package code
> on GitHub, it's a small step to configure GitHub Actions to check your
> packages across platforms on their servers.  It's free and fairly
> straightforward.  There should be plenty of tutorials and examples
> online for how to do that with R packages.  So, no need to muck around
> with Linux containers etc.
>
> /Henrik
>
> On Tue, Nov 28, 2023 at 12:30 PM Daniel Kelley <kelley using dal.ca> wrote:
>> To HB: I also maintain a package that has this problem.  I do not have access to a linux machine (or a machine with the C++ version in question) so I spent quite a while trying to get docker set up. That was a slow process because I had to install R, a bunch of packages, some other software, and so forth.  Anyway, the docker container I had used didn't seem to have a compiler that gave these warnings.  But, by then, I saw that the machine used by
>>
>> devtools::check_win_devel()
>>
>> was giving those warnings :-)
>>
>> So, now there is a way to debug these things.
>>
>> PS. I also tried using rhub, but it takes a long time and often results in a PREPERROR.
>>
>> On Nov 28, 2023, at 3:58 PM, Henrik Bengtsson <henrik.bengtsson using gmail.com> wrote:
>>
>> "%td" is not supported on all platforms/compilers.  This is what I got
>> when I added it to 'matrixStats':
>>
>> * using log directory 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck'
>> * using R Under development (unstable) (2023-11-26 r85638 ucrt)
>> * using platform: x86_64-w64-mingw32
>> * R was compiled by
>> gcc.exe (GCC) 12.3.0
>> GNU Fortran (GCC) 12.3.0
>> * running under: Windows Server 2022 x64 (build 20348)
>> * using session charset: UTF-8
>> * using options '--no-manual --as-cran'
>> * checking for file 'matrixStats/DESCRIPTION' ... OK
>> * this is package 'matrixStats' version '1.1.0-9003'
>> * checking package namespace information ... OK
>> * checking package dependencies ... OK
>> * checking if this is a source package ... OK
>> * checking if there is a namespace ... OK
>> * checking for executable files ... OK
>> * checking for hidden files and directories ... OK
>> * checking for portable file names ... OK
>> * checking serialization versions ... OK
>> * checking whether package 'matrixStats' can be installed ... [22s] WARNING
>> Found the following significant warnings:
>> binCounts.c:25:81: warning: unknown conversion type character 't' in
>> format [-Wformat=]
>> binCounts.c:25:11: warning: too many arguments for format [-Wformat-extra-args]
>> binMeans.c:26:60: warning: unknown conversion type character 't' in
>> format [-Wformat=]
>> binMeans.c:26:67: warning: unknown conversion type character 't' in
>> format [-Wformat=]
>> ...
>> See 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck/00install.out'
>> for details.
>> * used C compiler: 'gcc.exe (GCC) 12.2.0'
>>
>> It worked fine on Linux. Because of this, I resorted to the coercion
>> strategy, i.e. "%lld" and (long long int)value.  FWIW, on MS Windows,
>> I see 'ptrdiff_t' being 'long long int', whereas on Linux I see 'long
>> int'.
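
A small sketch of the two points in the paragraph above, assuming a C11
compiler (for _Generic). The helper names ptrdiff_underlying and
format_via_lld are hypothetical. The first reports which standard integer
type ptrdiff_t aliases on the current platform; the second shows the
coercion strategy, which does not depend on that answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Name the standard integer type that ptrdiff_t aliases on this platform.
   Requires C11 for _Generic. Hypothetical helper, not part of R. */
const char *ptrdiff_underlying(void)
{
    return _Generic((ptrdiff_t)0,
                    int: "int",
                    long: "long int",
                    long long: "long long int",
                    default: "other");
}

/* The coercion strategy: cast to long long and use "%lld", so the format
   matches the argument whatever ptrdiff_t happens to be. */
int format_via_lld(char *buf, size_t size, ptrdiff_t n)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lld", (long long)n);
}
```

On 64-bit Windows the first function would be expected to report
'long long int' and on 64-bit Linux 'long int', matching the observation
above; the second function produces the same text on both.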
>>
>> /Henrik
>>
>> On Tue, Nov 28, 2023 at 11:51 AM Ivan Krylov <krylov.r00t using gmail.com> wrote:
>>
>>
>> On Wed, 29 Nov 2023 06:11:23 +1100
>> Hugh Parsonage <hugh.parsonage using gmail.com> wrote:
>>
>> Rprintf("%lld", (long long) xlength(x));
>>
>>
>> This is fine. long longs are guaranteed to be at least 64 bits in size
>> and are signed, just like lengths in R.
>>
>> Rprintf("%td", xlength(x));
>>
>>
>> Maybe if you cast it to ptrdiff_t first. Otherwise I would expect this
>> to fail on an (increasingly rare) 32-bit system where R_xlen_t is int
>> (which is an implementation detail).
>>
>> In my opinion, ptrdiff_t is just the right type for array lengths if
>> they have to be signed (which is useful for Fortran interoperability),
>> so Rprintf("%td", (ptrdiff_t)xlength(x)) would be my preferred option
>> for now. By definition of ptrdiff_t, you can be sure [*] that there
>> won't be any vectors on your system longer than PTRDIFF_MAX.
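
A minimal sketch of why the explicit cast matters, using snprintf so it
compiles without R headers. The typedef short_xlen_t is a hypothetical
stand-in for R_xlen_t on a build where, as described above, the length
type is a plain int; print_len is likewise a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for R_xlen_t on a build where it is an int. */
typedef int short_xlen_t;

/* The cast makes the variadic argument a ptrdiff_t, which is the type
   "%td" expects, whatever the underlying length type happens to be. */
int print_len(char *buf, size_t size, short_xlen_t n)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%td", (ptrdiff_t)n);
}
```

Without the cast, an int would be passed where the format expects a
ptrdiff_t, which is only correct when the two types happen to coincide.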
>>
>> using the string macro found in Mr Kalibera's commit of r85641:
>> R_PRIdXLEN_T
>>
>>
>> I think this will be the best solution once we can afford
>> having our packages depend on R >= 4.4.
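
A sketch of how a format macro of this kind is typically used, in the
style of PRId64 from <inttypes.h>: the macro expands to a string literal
that the compiler concatenates into the format string. The names
my_xlen_t and MY_PRIdXLEN are hypothetical stand-ins so the sketch
compiles without R headers, and the "td" definition is an assumption
that only fits a build where the length type is ptrdiff_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for R_xlen_t and R_PRIdXLEN_T. */
typedef ptrdiff_t my_xlen_t;
#define MY_PRIdXLEN "td"

/* Adjacent string literals are concatenated, so the format string below
   becomes "length = %td" for this definition of MY_PRIdXLEN. */
int format_xlen(char *buf, size_t size, my_xlen_t n)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "length = %" MY_PRIdXLEN, n);
}
```

In package code depending on R >= 4.4, the equivalent call would
presumably look like Rprintf("%" R_PRIdXLEN_T "\n", xlength(x)), with no
cast needed because the macro tracks the definition of R_xlen_t.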
>>
>> --
>> Best regards,
>> Ivan
>>
>> [*] https://en.cppreference.com/w/c/types/ptrdiff_t posits that there
>> may exist long vectors that fit in SIZE_MAX (unsigned) elements but not
>> PTRDIFF_MAX (signed) elements. If such vector exists, subtracting two
>> pointers to its insides may result in undefined behaviour. This may be
>> already possible in a 32-bit process on Linux running with a 3G
>> user-space / 1G kernel-space split. The only way around the problem is
>> to use unsigned types for lengths, but that would preclude Fortran
>> compatibility.
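
A sketch of the range check implied by this footnote, assuming size_t is
at least as wide as ptrdiff_t (true on mainstream platforms). The helper
names fits_ptrdiff and to_ptrdiff are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if an element count n can be represented as a ptrdiff_t,
   i.e. it does not exceed PTRDIFF_MAX; return 0 otherwise.
   Assumes size_t is at least as wide as ptrdiff_t. */
int fits_ptrdiff(size_t n)
{
    return n <= (size_t)PTRDIFF_MAX;
}

/* Convert a size_t count to ptrdiff_t; the caller is expected to have
   checked fits_ptrdiff(n) first. */
ptrdiff_t to_ptrdiff(size_t n)
{
    assert(fits_ptrdiff(n));
    return (ptrdiff_t)n;
}
```

Counts between PTRDIFF_MAX and SIZE_MAX are exactly the case the
footnote describes: representable as an unsigned length, but not as a
signed one.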
>>
>> ______________________________________________
>> R-package-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel