[Rd] is.atomic(NULL) will become FALSE

Martin Maechler maechler at stat.math.ethz.ch
Mon Sep 25 17:14:58 CEST 2023


Some of you may remember posts and/or e-mails about this topic
many months ago.

"Atomic vector" is a very well defined term for somewhat advanced
R users.  E.g., from the 'R Language Definition' manual (in the development
version; i.e. made from latest R source code):
   https://cran.r-project.org/doc/manuals/r-devel/R-lang.html 

Here
  https://cran.r-project.org/doc/manuals/r-devel/R-lang.html#Vector-objects
says

------------------------------------------------------------------------------
2.1.1 Vectors

Vectors can be thought of as contiguous cells containing
data. Cells are accessed through indexing operations such as
x[5]. More details are given in Indexing.

R has six basic (‘atomic’) vector types: logical, integer, real,
complex, character (in C aka ‘string’) and raw. The modes and
storage modes for the different vector types are listed in the
following table.

    typeof      mode      storage.mode

    logical     logical   logical
    integer     numeric   integer
    double      numeric   double
    complex     complex   complex
    character   character character
    raw         raw       raw

Single numbers, such as 4.2, and strings, such as "four point
two" are still vectors, of length 1; there are no more basic
types. Vectors with length zero are possible (and useful).

A single element of a character vector is often referred to as a
character string or short string.
------------------------------------------------------------------------------
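
The table above can be checked directly at the R prompt.  What follows
is only a minimal sketch using base R functions; the names `examples`,
`tab`, `all_atomic` and `all_length1` are made up for illustration and
are not part of the manual:

```r
# One example value for each of the six atomic vector types
examples <- list(TRUE, 1L, 4.2, 1i, "four point two", as.raw(1))

# Tabulate typeof(), mode() and storage.mode() for each example
tab <- data.frame(
  typeof       = vapply(examples, typeof, ""),
  mode         = vapply(examples, mode, ""),
  storage.mode = vapply(examples, storage.mode, ""),
  stringsAsFactors = FALSE
)
print(tab)

# All six examples are atomic, and each is a vector of length 1
all_atomic  <- all(vapply(examples, is.atomic, NA))
all_length1 <- all(lengths(examples) == 1L)
```

Note that none of these checks involve NULL, so they give the same
result in older and newer versions of R.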

The "Writing R Extensions" manual, considerably more important
still, notably to package writers, also mentions atomic vectors,
e.g., here (again the 6 "R storage mode"s for "basic" vectors in
the .C() interface):

  https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Interface-functions-_002eC-and-_002eFortran

and then, for the more advanced, when talking about the C API and
mentioning some of the C-level atomic vector functions,
first "for all the atomic vector types" and then notably
isVectorAtomic() under "some convenience functions":

  https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Some-convenience-functions

Similarly, Hadley Wickham's book,  "Advanced R", has treated the
topic of  "atomic vectors", currently in
  https://adv-r.hadley.nz/vectors-chap.html#atomic-vectors

-------------------------------------

For historical reasons, R having been created (from ~1993 on) to be
mostly S-compatible, the is.atomic() function was made compatible
as well, which meant that it gave TRUE not only for the 6 atomic
vector types but also for NULL.
At the time, much of the S code (and originally quite a bit
more of the R code than now) treated NULL as meaning "any vector
of size 0".  Probably because NULL was used as a preliminary /
expressive shortcut for numeric(0), logical(0), etc., it was
convenient for is.atomic(NULL) to give TRUE, the same as it does
for all numeric(), integer(), logical(), ... atomic vectors.

But much time has passed, and for many months if not several
years we have contemplated making is.atomic() behave according
to the language definition of an atomic vector.

For this reason, we plan that next year's release of R will have

  > is.atomic(NULL)
  [1] FALSE

instead of  TRUE  as now and historically.

Some package maintainers have been alerted, some as early as 19
months ago  (Feb. 2022), and others a few hours ago that the
above will happen.

Our current plan means it will happen already within the next
few days *if* you are working with the very latest development
versions of R, often called  "R-devel" (as this mailing list).

This also means that package maintainers who check their
packages with R-devel will see the change: automated CRAN jobs
do this on an almost daily schedule; Bioconductor will do so
very soon if they have not already started; and the same can
happen if you use R Hub, CI tools or docker versions of "R-devel".

In all cases where code starts working differently than
previously you could replace

	    is.atomic(<foo>)
by	   (is.atomic(<foo>) || is.null(<foo>))

which will have the R code working equivalently in older and
very new / future versions of R.
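
As a minimal sketch of the replacement above, the combined test can be
wrapped in a small helper.  The name is_atomic_or_null() is
hypothetical, chosen here only for illustration; it is not a base R
function:

```r
# Hypothetical helper: TRUE for atomic vectors and for NULL, regardless of
# whether is.atomic(NULL) is TRUE (older R) or FALSE (newer R)
is_atomic_or_null <- function(x) is.atomic(x) || is.null(x)

is_atomic_or_null(NULL)        # TRUE in any R version
is_atomic_or_null(numeric(0))  # TRUE
is_atomic_or_null(list())      # FALSE: a list is neither atomic nor NULL
```

Because the result for NULL comes from is.null() when is.atomic()
returns FALSE, the helper does not depend on which version of R is
running.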

Often, however, such a change is unnecessary (and even
"wrong" in principle), namely whenever "you" (or the person who
wrote the code you are working with) really *meant* to check for
atomic vectors, which indeed do *not* include NULL.

In other cases, it may be more readable and also better code to
replace code of the form

  if( is.atomic(<foo>) ) {
     .... deal with both NULL and truly-atomic cases ...
     .... deal with both NULL and truly-atomic cases ...
     .... deal with both NULL and truly-atomic cases ...
  }     

by

  if( is.null(<foo>) ) {
     .... deal with NULL case ....
  } else if( is.atomic(<foo>) ) {
     .... deal with truly-atomic case ...
     .... deal with truly-atomic case ...
  }
  
Again, such re-writing makes sense and may improve the quality
and even efficiency of your code, already for current versions
of R, and will "automatically" continue to work correctly in
future versions of R where  is.atomic(NULL)  will no longer be
true.
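
The  if / else if  pattern above can be sketched as a small runnable
function.  This is only an illustration: describe() and its return
strings are hypothetical, not taken from any package:

```r
# Hypothetical function: the NULL case is handled explicitly first, so the
# result does not depend on what is.atomic(NULL) returns
describe <- function(x) {
  if (is.null(x)) {
    "NULL"
  } else if (is.atomic(x)) {
    sprintf("atomic <%s> of length %d", typeof(x), length(x))
  } else {
    sprintf("non-atomic <%s>", typeof(x))
  }
}

describe(NULL)
describe(c(1.5, 2))
describe(list(1))
```

Testing is.null() first means the is.atomic() branch is only ever
reached for non-NULL objects, which is why the function gives the same
answers in older and newer versions of R.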

We hope this will help programming safety *and*
make learning and teaching of R more consistent.

Enjoy using R!
Martin

--
Martin Maechler
ETH Zurich  and  R Core team


