[R-pkg-devel] Canonical way to Rprintf R_xlen_t

Wed Nov 29 00:14:46 CET 2023

On 11/28/23 22:21, Tomas Kalibera wrote:
>
> On 11/28/23 21:50, Henrik Bengtsson wrote:
>> Daniel, I get those compiler warnings for '%td' on MS Windows. It works
>> fine on Linux.
>
> Please let me clarify. %td works in R on Windows in R 4.3 and R-devel, 
> when using the recommended toolchain, which is Rtools43. It also 
> worked with R 4.2 and Rtools42. It works since R has switched to UCRT 
> on Windows. I assume you are not using a recommended toolchain and 
> this is why you are getting the warning - please let me know if this 
> is not the case and I will try to help.

I forgot to say that one needs R-devel r85640 or newer. With R-devel 
revisions r85619 through r85639 you could get the warning for %td etc. 
even with the recommended toolchain, because those revisions used the 
"printf" format in the header, whereas on Windows it needs to be 
"gnu_printf" (which is the C99/UCRT format). I fixed that in r85640.
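
For reference, the two cast-based spellings that come up further down in 
this thread can be sketched as below. This is only an illustration, not 
code from R or from any package: it uses snprintf in place of Rprintf so 
that it compiles without R's headers, and xlen_sketch_t, format_len_lld, 
format_len_td and casts_agree are made-up names. It assumes a 64-bit 
build where R_xlen_t is a ptrdiff_t.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* ptrdiff_t, size_t */
#include <stdio.h>    /* snprintf */
#include <string.h>   /* strcmp */

/* Stand-in for R_xlen_t so the sketch compiles without R's headers
 * (assumption: on 64-bit builds R_xlen_t is a ptrdiff_t). */
typedef ptrdiff_t xlen_sketch_t;

/* Option 1: widen to long long and print with "%lld". */
static int format_len_lld(char *buf, size_t size, xlen_sketch_t len)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lld", (long long) len);
}

/* Option 2: cast to ptrdiff_t and print with the C99 "%td" conversion. */
static int format_len_td(char *buf, size_t size, xlen_sketch_t len)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%td", (ptrdiff_t) len);
}

/* Returns 1 when both spellings produce the same text for len. */
static int casts_agree(xlen_sketch_t len)
{
    char a[32], b[32];
    format_len_lld(a, sizeof a, len);
    format_len_td(b, sizeof b, len);
    return strcmp(a, b) == 0;
}
```

In package code the same casts would be passed to Rprintf instead of 
snprintf. Whether the compiler warns about "%td" then depends on the 
toolchain and R revision, as described above.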

Best
Tomas

>
> There is a bug in GCC, still present in gcc 12 and gcc 10, due to 
> which gcc displays warnings about the format even when it is 
> supported. The details are complicated, but in short, it accidentally 
> applies both Microsoft format and C99/GNU format checks to printf 
> functions with UCRT - so you get a warning whenever the two formats 
> disagree. That includes printing a 64-bit integer, and also %td, 
> which is not supported by the Microsoft format, or say %zu (size_t) 
> or %Lf (long double). I've been patching GCC in Rtools42 and Rtools43 
> to avoid this problem, so you don't get the warning there. My patch 
> has also been picked up by Msys2; I didn't check whether it is still 
> there or not. Finally, a new implementation of the patch was accepted 
> to GCC trunk, so eventually this will no longer be needed. But 
> regardless of which version of GCC Rtools44 will use, I will make sure 
> it accepts C99 printf formats without warnings. An unpatched GCC 10 
> or 12 with UCRT will print a warning for %td but will support it.
>
> Best
> Tomas
>
>> FYI, https://builder.r-hub.io/ is a great, free service for testing on
>> various platforms in the cloud.  Also, if you host your package code
>> on GitHub, it's a small step to configure GitHub Actions to check your
>> packages across platforms on their servers.  It's free and fairly
>> straightforward.  There should be plenty of tutorials and examples
>> online for how to do that with R packages.  So, no need to muck around
>> with Linux containers etc.
>>
>> /Henrik
>>
>> On Tue, Nov 28, 2023 at 12:30 PM Daniel Kelley <kelley using dal.ca> wrote:
>>> To HB: I also maintain a package that has this problem.  I do not 
>>> have access to a linux machine (or a machine with the C++ version in 
>>> question) so I spent quite a while trying to get docker set up. That 
>>> was a slow process because I had to install R, a bunch of packages, 
>>> some other software, and so forth.  Anyway, the docker container I 
>>> had used didn't seem to have a compiler that gave these warnings. 
>>> But, by then, I saw that the machine used by
>>>
>>> devtools::check_win_devel()
>>>
>>> was giving those warnings :-)
>>>
>>> So, now there is a way to debug these things.
>>>
>>> PS. I also tried using rhub, but it takes a long time and often 
>>> results in a PREPERROR.
>>>
>>> On Nov 28, 2023, at 3:58 PM, Henrik Bengtsson 
>>> <henrik.bengtsson using gmail.com> wrote:
>>>
>>> "%td" is not supported on all platforms/compilers.  This is what I got
>>> when I added it to 'matrixStats';
>>>
>>> * using log directory 
>>> 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck'
>>> * using R Under development (unstable) (2023-11-26 r85638 ucrt)
>>> * using platform: x86_64-w64-mingw32
>>> * R was compiled by
>>> gcc.exe (GCC) 12.3.0
>>> GNU Fortran (GCC) 12.3.0
>>> * running under: Windows Server 2022 x64 (build 20348)
>>> * using session charset: UTF-8
>>> * using options '--no-manual --as-cran'
>>> * checking for file 'matrixStats/DESCRIPTION' ... OK
>>> * this is package 'matrixStats' version '1.1.0-9003'
>>> * checking package namespace information ... OK
>>> * checking package dependencies ... OK
>>> * checking if this is a source package ... OK
>>> * checking if there is a namespace ... OK
>>> * checking for executable files ... OK
>>> * checking for hidden files and directories ... OK
>>> * checking for portable file names ... OK
>>> * checking serialization versions ... OK
>>> * checking whether package 'matrixStats' can be installed ... [22s] 
>>> WARNING
>>> Found the following significant warnings:
>>> binCounts.c:25:81: warning: unknown conversion type character 't' in
>>> format [-Wformat=]
>>> binCounts.c:25:11: warning: too many arguments for format 
>>> [-Wformat-extra-args]
>>> binMeans.c:26:60: warning: unknown conversion type character 't' in
>>> format [-Wformat=]
>>> binMeans.c:26:67: warning: unknown conversion type character 't' in
>>> format [-Wformat=]
>>> ...
>>> See 
>>> 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck/00install.out'
>>> for details.
>>> * used C compiler: 'gcc.exe (GCC) 12.2.0'
>>>
>>> It worked fine on Linux. Because of this, I resorted to the coercion
>>> strategy, i.e. "%lld" and (long long int)value.  FWIW, on MS Windows,
>>> I see 'ptrdiff_t' being 'long long int', whereas on Linux I see 'long
>>> int'.
>>>
>>> /Henrik
>>>
>>> On Tue, Nov 28, 2023 at 11:51 AM Ivan Krylov <krylov.r00t using gmail.com> 
>>> wrote:
>>>
>>>
>>> On Wed, 29 Nov 2023 06:11:23 +1100
>>> Hugh Parsonage <hugh.parsonage using gmail.com> wrote:
>>>
>>> Rprintf("%lld", (long long) xlength(x));
>>>
>>>
>>> This is fine. long longs are guaranteed to be at least 64 bits in size
>>> and are signed, just like lengths in R.
>>>
>>> Rprintf("%td", xlength(x));
>>>
>>>
>>> Maybe if you cast it to ptrdiff_t first. Otherwise I would expect this
>>> to fail on an (increasingly rare) 32-bit system where R_xlen_t is int
>>> (which is an implementation detail).
>>>
>>> In my opinion, ptrdiff_t is just the right type for array lengths if
>>> they have to be signed (which is useful for Fortran interoperability),
>>> so Rprintf("%td", (ptrdiff_t)xlength(x)) would be my preferred option
>>> for now. By definition of ptrdiff_t, you can be sure [*] that there
>>> won't be any vectors on your system longer than PTRDIFF_MAX.
>>>
>>> using the string macro found in Mr Kalibera's commit of r85641:
>>> R_PRIdXLEN_T
>>>
>>>
>>> I think this will be the best solution once we can afford
>>> having our packages depend on R >= 4.4.
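
The macro approach mentioned here follows the same pattern as the PRId64 
family in <inttypes.h>: the macro expands to a string literal holding 
the conversion, and the compiler concatenates it with the rest of the 
format. A minimal sketch of that pattern, with made-up names 
(sketch_xlen_t, SKETCH_PRIdXLEN, format_xlen) standing in for R's own 
definitions, which are not reproduced here:

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* ptrdiff_t, size_t */
#include <stdio.h>    /* snprintf */

/* Hypothetical stand-ins, NOT R's definitions. The type and the
 * conversion string are defined side by side so they cannot drift apart. */
typedef ptrdiff_t sketch_xlen_t;
#define SKETCH_PRIdXLEN "td"

/* Formats "n = <len>" into buf; returns the number of characters
 * that snprintf reports. */
static int format_xlen(char *buf, size_t size, sketch_xlen_t len)
{
    assert(buf != NULL && size > 0);
    /* Adjacent string literals are concatenated at compile time:
     * "n = %" "td" becomes "n = %td". */
    return snprintf(buf, size, "n = %" SKETCH_PRIdXLEN, len);
}
```

With R >= 4.4 the corresponding call would presumably look like 
Rprintf("%" R_PRIdXLEN_T "\n", xlength(x)), with no cast needed; that 
usage is an assumption based on the macro's name and should be checked 
against the R headers.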
>>>
>>> -- 
>>> Best regards,
>>> Ivan
>>>
>>> [*] https://en.cppreference.com/w/c/types/ptrdiff_t posits that there
>>> may exist long vectors that fit in SIZE_MAX (unsigned) elements but not
>>> PTRDIFF_MAX (signed) elements. If such a vector exists, subtracting two
>>> pointers to its insides may result in undefined behaviour. This may be
>>> already possible in a 32-bit process on Linux running with a 3G
>>> user-space / 1G kernel-space split. The only way around the problem is
>>> to use unsigned types for lengths, but that would preclude Fortran
>>> compatibility.
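
The gap between SIZE_MAX and PTRDIFF_MAX described in this footnote can 
be made concrete with a small range check. This is a sketch only; 
fits_in_ptrdiff and to_ptrdiff_checked are made-up helper names, not 
part of R or of the C library.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* ptrdiff_t, size_t */
#include <stdint.h>   /* PTRDIFF_MAX, SIZE_MAX */

/* Returns 1 if an element count n is representable as a ptrdiff_t. */
static int fits_in_ptrdiff(size_t n)
{
    return n <= (size_t) PTRDIFF_MAX;
}

/* Converts n to ptrdiff_t; aborts via assert when n does not fit. */
static ptrdiff_t to_ptrdiff_checked(size_t n)
{
    assert(fits_in_ptrdiff(n));
    return (ptrdiff_t) n;
}
```

On common platforms size_t and ptrdiff_t have the same width, so 
SIZE_MAX itself fails this check while every value up to PTRDIFF_MAX 
passes.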
>>>
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel