[Rd] boolean and logical types -draft
Prof Brian Ripley
r|p|eybd @end|ng |rom |c|oud@com
Mon Feb 3 18:39:07 CET 2025
Sent in error (and not moderated).
On 03/02/2025 17:36, Prof Brian Ripley via R-devel wrote:
> Tomas,
>
> I am thinking of writing something for R-devel, and hope to have your
> input first.
>
> I get moderated on R-devel as I am now subscribed as brian.ripley using
> R-project.org, which of course I cannot send from. So I am even more
> discouraged from posting there. (R-core is bad enough with Luke
> discouraging all innovation except by him and Simon completely
> misunderstanding the C23 status.)
>
> Thanks,
>
> Brian
>
> ----------------
>
> There are several of these, and few guarantees for inter-working.
>
> a) R's logical vectors, which include a value NA for its elements.
> b) R's Rboolean type in C/C++
>
> c) C++'s bool type
> d) C23's bool type
> e) C99's _Bool type to which bool is aliased if <stdbool.h> is included.
> f) Fortran's LOGICAL type
>
> a) is currently implemented as a C int (so 32-bit) type with NA as the C
> value NA_LOGICAL which is the same as NA_INTEGER.
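
A minimal sketch of what a) implies for C code, not taken from R's sources. SKETCH_NA_LOGICAL and sketch_lgl_to_bool are hypothetical names, and the sketch assumes the NA value is INT_MIN rather than including R's headers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for R's NA_LOGICAL (assumed here to be INT_MIN). */
#define SKETCH_NA_LOGICAL INT_MIN

/* Convert an int-coded logical to bool.
 * Returns true and stores the value in *out when x is not NA;
 * returns false and leaves *out untouched when x is NA. */
static bool sketch_lgl_to_bool(int x, bool *out)
{
    assert(out != NULL);
    if (x == SKETCH_NA_LOGICAL)
        return false;
    *out = (x != 0);
    return true;
}
```

The separate return value is one way to carry the third state, since bool itself has no room for NA.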
>
> b) is currently implemented as a C enum with two values. I don't know
> of any guarantees on how that is stored except in char or an integer
> type (C23 §6.7.3.3) -- however it seems common practice to use a 32-bit
> type (int or unsigned int would not be distinguishable). In C23 enums
> can have a specified underlying type, but we do not specify one.
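
A small sketch of the storage question in b), using a hypothetical two-valued enum SketchBoolean as a stand-in for Rboolean (the names are illustrative, not R's). It only reports the size the compiler chose. The C23 form with a fixed underlying type is shown in a comment, since pre-C23 compilers may reject it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Rboolean: an enum with two values. */
typedef enum { SKETCH_FALSE = 0, SKETCH_TRUE } SketchBoolean;

/* C23 only: the storage could be pinned with a fixed underlying type, e.g.
 *   typedef enum : int { SKETCH_FALSE = 0, SKETCH_TRUE } SketchBoolean;
 */

/* Size in bytes the compiler chose for the enum (commonly sizeof(int),
 * but that is the compiler's choice, not something this code enforces). */
static size_t sketch_boolean_size(void)
{
    /* The enumerator values are fixed by the declaration; the size is not. */
    assert(SKETCH_FALSE == 0 && SKETCH_TRUE == 1);
    return sizeof(SketchBoolean);
}
```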
>
> C23 states that bool has 1 value bit and some padding bits (§6.2.6.2) so
> it can be stored in char-sized storage (i.e. bytes) or multiples
> thereof. And that _Bool is an alternative name for bool.
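
Since the storage for bool may be a single byte while a) uses int, an int array cannot safely be reinterpreted as a bool array. A minimal sketch of an element-wise copy follows; sketch_int_to_bool is a hypothetical helper, and NA handling is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copy n int-coded flags into a bool array, one element at a time.
 * Any non-zero int becomes true; NA is NOT treated specially here. */
static void sketch_int_to_bool(const int *src, bool *dst, size_t n)
{
    assert(n == 0 || (src != NULL && dst != NULL));
    for (size_t i = 0; i < n; i++)
        dst[i] = (src[i] != 0);
}
```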
>
> f) is compiler-dependent: for interoperability with C or R, code should
> use the c_bool kind from iso_c_binding (Fortran 2003). Fortran compilers store
> LOGICAL in compiler-dependent ways, and for a long time we got away with
> assuming that was equivalent to int (so LOGICAL values could be passed
> to and from with int* on the C/R side). But sometime around GCC 8 they
> changed to int_least32_t, which on common platforms is the same as int
> but does not need to be.
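
A minimal sketch of the C side of the c_bool route: a function taking bool values that Fortran code could call through a bind(C) interface. sketch_count_true is a hypothetical name, and the Fortran interface in the comment is an untested assumption about how a caller might declare it.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed Fortran-side declaration (illustrative only):
 *
 *   interface
 *     function sketch_count_true(flags, n) bind(C, name="sketch_count_true") result(k)
 *       use iso_c_binding, only: c_bool, c_int
 *       logical(c_bool), intent(in) :: flags(*)
 *       integer(c_int), value :: n
 *       integer(c_int) :: k
 *     end function
 *   end interface
 */

/* Count the true entries among the first n flags. */
int sketch_count_true(const bool *flags, int n)
{
    assert(n >= 0);
    int k = 0;
    for (int i = 0; i < n; i++)
        if (flags[i])
            k++;
    return k;
}
```

Default-kind LOGICAL arguments would not match this signature, which is the point of declaring the kind explicitly.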
>
> It seems that in all cases coercion to an integer type coerces false
> values to 0 and true values to 1 (and this is guaranteed by C23 at
> least). And C23 guarantees that when coercing from an integer type to
> bool zero values are coerced to false and non-zero ones to true (bool is
> 'an unsigned integer type'). However, that does not seem to be true for
> C++ as UB sanitizers warn on coercing values other than 0/1.
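
The C side of the coercion rules above can be sketched as follows. sketch_bool_roundtrip is a hypothetical helper, and the sketch covers only value conversion in C, not C++ and not bool objects whose stored bytes are something other than 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Convert an int to bool and back to int.
 * Zero stays 0; every non-zero value comes back as 1. */
static int sketch_bool_roundtrip(int x)
{
    bool b = x;        /* integer -> bool: zero is false, non-zero is true */
    int r = (int) b;   /* bool -> integer: false is 0, true is 1 */
    assert(r == 0 || r == 1);
    return r;
}
```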
>
> I believe it to be the intention that c), d) and e) have the same
> representation and interwork using the same compiler, but I could not
> find that documented and see signs that e) might differ in C17 and C23
> modes.
>
> ----------------
>
> I need to look again at the C and C++ standards which with my vision I
> need to do in very small chunks. Oh for the vision I once had!
>
--
Brian D. Ripley, ripley using stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford