[Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

Mon Sep 12 17:21:14 CEST 2016

>>>>> Radford Neal <radford at cs.toronto.edu>
>>>>>     on Fri, 9 Sep 2016 10:29:14 -0400 writes:

    >> Radford Neal:
    >> > So it may make more sense to move towards consistency in the
    >> > permissive direction, rather than the restrictive direction.
    >> 
    >> > That would mean allowing matrix(1,1,1) < (1:2), and maybe also things
    >> > like matrix(1,2,2)+(1:8).
    >> 
    >> Martin Maechler:
    >> That is an interesting idea.  Yes, in my view that would
    >> definitely also have to allow the latter, by the above argument
    >> of not treating the dim/dimnames attributes special.  For
    >> non-arrays length-1 is not treated much special apart from the
    >> fact that length-1 can always be recycled (without warning).

    > I think one could argue for allowing matrix(1,1,1)+(1:8) but not
    > matrix(1,2,2)+(1:8).  Length-1 vectors certainly are special in some
    > circumstances, being R's only way of representing a scalar.  For
    > instance, if (c(T,F)) gives a warning.

Well, the if(.) situation is very special and does not carry
much weight for me here.

    > This really goes back to what I think may have been a basic mistake in
    > the design of S, in deciding that everything is a vector, then halfway
    > modifying this with dim attributes, but it's too late to totally undo
    > that (though allowing a 0-length dim attribute to explicitly mark a
    > length-1 vector as a scalar might help).

(yes; I think there are also other ideas of adding true small
 scalars to R... I am not familiar with those, and in any case
 that should be a completely different thread and not be
 discussed in this one)

    >> > And I think there would be some significant problems. In addition to
    >> > the 10-20+ packages that Martin expects to break, there could be quite
    >> > a bit of user code that would no longer work - scripts for analysing
    >> > data sets that used to work, but now don't with the latest version.
    >> 
    >> That's not true (at least for the cases above): They would give
    >> a strong warning

    > But isn't the intent to make it an error later?  So I assume we're
    > debating making it an error, not just a warning.  

Yes, that's correct.
But if we have a longish deprecation period (i.e. where there's
only a warning) all important code should have been adapted
before it turns to an error 
 (( unless for those people who are careless enough to "graciously"
    use something like suppressWarnings(...) in too many places )).

    > (Though I'm
    > generally opposed to such warnings anyway, unless they could somehow
    > be restricted to come up only for interactive uses, not from deep in a
    > program the user didn't write, making them totally mysterious...)

    >> *and* the  logic and relop versions of this, e.g.,
    >> matrix(TRUE,1) | c(TRUE,FALSE) ;  matrix(1,1) > 1:2,
    >> have always been an  error; so nothing would break there.
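
The cases discussed so far can be probed side by side. The following is
only a sketch: `outcome` and `cases` are hypothetical helper names, not
part of R, and the result for the length-1 array cases depends on the R
version in use, so no output is claimed for them here. Only the 2x2 case
is certain to fail, because its dims do not match the longer vector.

```r
# Hypothetical helper: evaluate an expression and report whether it
# ran cleanly, raised a warning, or raised an error.
outcome <- function(expr) {
  tryCatch(
    { force(expr); "ok" },
    warning = function(w) paste("warning:", conditionMessage(w)),
    error   = function(e) paste("error:", conditionMessage(e))
  )
}

cases <- c(
  arith_1x1 = outcome(matrix(1, 1, 1) + 1:8),             # length-1 array, arithmetic
  arith_2x2 = outcome(matrix(1, 2, 2) + 1:8),             # 2x2 array, arithmetic
  relop_1x1 = outcome(matrix(1, 1) > 1:2),                # length-1 array, comparison
  logic_1x1 = outcome(matrix(TRUE, 1) | c(TRUE, FALSE)),  # length-1 array, logic
  plain_vec = outcome(1 + 1:8)                            # no dim attribute at all
)
print(cases)
```

Running this under different R versions shows which of the length-1
cases are treated as "ok", a warning, or an error on that version.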

    > Yes, that wouldn't change the behaviour of old code, but if we're
    > aiming for consistency, it might make sense to get rid of that error,
    > allowing code like sum(a%*%b<c(10,20,30)) with a and b being vectors,
    > rather than forcing the programmer to write sum(c(a%*%b)<c(10,20,30)).
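
To make the a%*%b example concrete, here is a small sketch. The values of
`a` and `b` are made up for illustration; the point is only that %*% on
two vectors returns a 1x1 matrix, and that c() strips the dim attribute
so the comparison recycles like any length-1 vector.

```r
a <- c(1, 2, 3)   # illustrative values, not from the thread
b <- c(1, 2, 3)

ab <- a %*% b     # inner product, returned as a 1x1 matrix (value 14)
dim(ab)           # 1 1

# Explicit form: c() drops the dim attribute first.
n_explicit <- sum(c(ab) < c(10, 20, 30))
n_explicit        # 2, since 14 < 20 and 14 < 30

# Direct form: whether this is accepted depends on the R version, so the
# result is captured rather than assumed.
direct <- tryCatch(
  sum(ab < c(10, 20, 30)),
  error = function(e) paste("error:", conditionMessage(e))
)
print(direct)
```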

Yes, that would be another way for consistency... leading to
fewer problems in existing code.  As said earlier, getting
consistency by becoming "more lenient" instead of "more restrictive" 
is a good option in my view.

We would, however, have this somewhat special length-1-array
exception in how arrays behave in binary OPs, and both the underlying C
code and the full documentation would become slightly more complicated
rather than simpler.

OTOH we would remain back compatible (*) to S or at least S-plus
(as far as I know) and all earlier versions of R, here,
and that is valuable, too, I agree.

Nobody else has commented yet on this sub-thread ... not even
privately to me.  If that status does not change quite a bit,
I don't see enough incentive for changing (the current R-devel code).

Martin

--
(*) "back-compatible" in the sense that old code which "worked"
    would continue to work the same
    (but some old code that gave an error would no longer do so)

    >> Of course; that *was* the reason the very special treatment for arithmetic
    >> length-1 arrays had been introduced.  It is convenient.
    >> 
    >> However, *some* of the conveniences in S (and hence R) functions
    >> have been dangerous {and much more used, hence close to
    >> impossible to abolish, e.g., sample(x) when x  is numeric of length 1,

    > There's a difference between these two.  Giving an error when using a
    > 1x1 matrix as a scalar may detect some programming bugs, but not
    > giving an error doesn't introduce a bug.  Whereas sample(2:n) behaving
    > differently when n is 2 than when n is greater than 2 is itself a bug,
    > that the programmer has to consciously avoid by being aware of the quirk.

    > Radford Neal
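
The sample() quirk mentioned above can be illustrated as follows. This is
a sketch only; `safe_sample` is a hypothetical name, and indexing with
sample.int(length(x)) is one common way to avoid the special case, not
the only one.

```r
# Hypothetical wrapper: always permute the elements of x themselves,
# even when x has length 1.
safe_sample <- function(x) x[sample.int(length(x))]

n <- 2
x <- 2:n                   # the single value 2

quirky <- sample(x)        # treated as sample(1:2): a permutation of 1, 2
robust <- safe_sample(x)   # stays the single value 2

length(quirky)             # 2
robust                     # 2

# With n > 2 both forms permute the same set of values.
n <- 5
sort(sample(2:n))          # 2 3 4 5
sort(safe_sample(2:n))     # 2 3 4 5
```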