[Rd] Should last default to .Machine$integer.max-1 for substring()

Martin Maechler maechler at stat.math.ethz.ch
Mon Jun 21 09:35:39 CEST 2021


>>>>> Michael Chirico 
>>>>>     on Sun, 20 Jun 2021 15:20:26 -0700 writes:

    > Currently, substring defaults to last=1000000L, which
    > strongly suggests the intent is to default to "nchar(x)"
    > without having to compute/allocate that up front.

    > Unfortunately, this default makes no sense for "very
    > large" strings which may exceed 1000000L in "width".

Yes, and I tend to agree with you that this default is outdated
(remember: R was written to work and run on 2 (or 4?) MB of RAM on the
student lab Macs in Auckland in ca. 1994).
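
For illustration, the truncation can be reproduced with a string longer
than 1000000 characters. This is only a minimal sketch: the variable names
are made up for the example, and the comment on the default case assumes
that `last` still defaults to 1000000L in the R version being run.

```r
# A string wider than the current default last = 1000000L
x <- strrep("a", 1500000L)

# Show the default of `last` in the running R version
print(formals(substring)$last)

# With the default `last`, substring() stops at character 1000000,
# so this yields 999999 characters when the default is 1000000L
tail_default <- substring(x, 2L)
print(nchar(tail_default))

# With an explicit upper bound the full remainder is returned
tail_full <- substring(x, 2L, .Machine$integer.max)
print(nchar(tail_full))
```

No warning is given in the default case, which is why the silent cut-off
is easy to miss.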

    > The max width of a string is .Machine$integer.max-1:

  (which Brodie showed was only almost true)

    > So it seems to me either .Machine$integer.max or
    > .Machine$integer.max-1L would be a more sensible default. Am I missing
    > something?

The "drawback" is of course that .Machine$integer.max is still
a function call (to `$`, as R beginners may forget), unlike the literal
<nnnnn>L, but that may even be inlined by the byte compiler (how would we
check?), and even if it is not, it conveys the concept and idea more
clearly *and* would probably even port automatically if the integer range
were ever increased in R.
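
One possible way to look into the inlining question is to disassemble the
byte code of two small functions and compare them by eye. This is a sketch
only: f_literal and f_machine are hypothetical helpers made up for the
comparison, and a default argument inside substring() may well be compiled
differently from a function body like these.

```r
library(compiler)

# Hypothetical helpers, used only for this comparison
f_literal <- function() 1000000L
f_machine <- function() .Machine$integer.max

# Compile both and print the disassembled byte code.
# If the second one were folded to a constant, its listing should look
# like the first; otherwise it should contain instructions for `$`.
print(disassemble(cmpfun(f_literal)))
print(disassemble(cmpfun(f_machine)))
```

The listings are not reproduced here, since they may differ between R
versions.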

Martin


