[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Sat Sep 1 02:24:32 CEST 2018

Thanks all for a great discussion.

I think we can introduce assertions for length(x) <= 1 (and produce a
warning/error if not) without changing the value of these &&/||
expressions.

In R 3.4.0, '_R_CHECK_LENGTH_1_CONDITION_=true' was introduced to turn
the warning "the condition has length > 1 and only the first element
will be used" in cases like 'if (c(TRUE, TRUE)) 42' into an error.  The
idea is to later make '_R_CHECK_LENGTH_1_CONDITION_=true' the new
default.  I guess that someday this will always produce an error.
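
To see what your own setup currently does with such a condition, a
small sketch like the following can help.  It only records which kind
of condition is signalled; which of the two you get depends on the R
version and on whether the variable is set, so I make no claim about
the output here.

```r
# Capture whatever 'if (c(TRUE, TRUE)) 42' signals in this R session.
# Depending on R version and settings this is a warning or an error.
res <- tryCatch(
  if (c(TRUE, TRUE)) 42,
  warning = function(w) "warning",
  error   = function(e) "error"
)
print(res)
```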

Similarly, the test for this &&/|| issue could be controlled by
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=warn' and
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=err', possibly with
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=true' meaning 'warn' at first and
'err' later.
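
To make the idea concrete, here is a rough R-level sketch of the
semantics I have in mind.  It is only an emulation: the real check
would live in the C code, the environment variable is the proposed one
and does not exist yet, and the helper names check_length1() and
and_checked() are made up for illustration.

```r
# Hypothetical helpers, NOT part of R.  They emulate 'x && y' with a
# length check controlled by the proposed environment variable.
check_length1 <- function(value, name) {
  if (length(value) > 1L) {
    mode <- Sys.getenv("_R_CHECK_LENGTH_1_LOGICAL_OPS_", "")
    msg <- sprintf("length(%s) = %d, expected length 1", name, length(value))
    if (mode == "err") stop(msg, call. = FALSE)
    if (mode %in% c("warn", "true")) warning(msg, call. = FALSE)
  }
  # Same value as today: only the first element is used, and a
  # zero-length input yields NA.
  as.logical(value[1])
}

and_checked <- function(x, y) {
  x1 <- check_length1(x, "x")
  # Short-circuit: 'y' is a promise and is only forced (and checked)
  # when the left-hand side does not already decide the result.
  if (!is.na(x1) && !x1) return(FALSE)
  x1 && check_length1(y, "y")
}

Sys.setenv("_R_CHECK_LENGTH_1_LOGICAL_OPS_" = "err")
and_checked(FALSE, c(TRUE, FALSE))      # FALSE; 'y' is never checked
try(and_checked(c(TRUE, FALSE), TRUE))  # signals an error
```

Note that the returned value is the same as before in every case
where no error is raised; only the warning/error is new.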

Changing the behavior of cases where length(x) == 0 is more likely to
break *some* code out there, and might require a separate
discussion/set of validations.  It's quite possible that someone
actually relies on this resolving to NA.  BTW, since it hasn't been
explicitly said, it's "logical" that we have TRUE && logical(0)
resolving to NA, because it currently behaves as TRUE[1] &&
logical(0)[1], which resolves to TRUE && NA => NA.  If a decision on
the zero-length case would delay fixing the length(x) > 1 case, I
would postpone the decision on the former.
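
The reasoning about the zero-length case can be spelled out with
explicit indexing, which sidesteps the question of how && itself
treats a zero-length operand:

```r
x <- TRUE
y <- logical(0)

y[1]             # NA: indexing past the end of a zero-length vector
x[1] && y[1]     # NA, i.e. TRUE && NA
FALSE && y[1]    # FALSE: the left-hand side already decides the result
```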

/Henrik

On Fri, Aug 31, 2018 at 2:48 AM Emil Bode <emil.bode using dans.knaw.nl> wrote:
>
>
> On 30/08/2018, 20:15, "R-devel on behalf of Hadley Wickham" <r-devel-bounces using r-project.org on behalf of h.wickham using gmail.com> wrote:
>
>     On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
>     <maechler using stat.math.ethz.ch> wrote:
>     >
>     > >>>>> Joris Meys
>     > >>>>>     on Thu, 30 Aug 2018 14:48:01 +0200 writes:
>     >
>     >     > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
>     >     > <toth.denes using kogentum.hu> wrote:
>     >     >> Note that `||` and `&&` have never been symmetric:
>     >     >>
>     >     >> TRUE || stop() # returns TRUE
>     >     >> stop() || TRUE # returns an error
>     >     >>
>     >     >>
>     >     > Fair point. So the suggestion would be to check whether x
>     >     > is of length 1 and whether y is of length 1 only when
>     >     > needed. I.e.
>     >
>     >     > c(TRUE,FALSE) || TRUE
>     >
>     >     > would give an error and
>     >
>     >     > TRUE || c(TRUE, FALSE)
>     >
>     >     > would pass.
>     >
>     >     > Thought about it a bit more, and I can't come up with a
>     >     > use case where the first line must pass. So if the short
>     >     > circuiting remains and the extra check only gives a small
>     >     > performance penalty, adding the error could indeed make
>     >     > some bugs more obvious.
>     >
>     > I agree "in theory".
>     > Thank you, Henrik, for bringing it up!
>     >
>     > In practice I think we should start having a warning signalled.
>     > I have checked the source code in the meantime, and the check
>     > is really very cheap
>     > { because it can/should be done after checking isNumber(): so
>     >   then we know we have an atomic and can use XLENGTH() }
>     >
>     >
>     > The 0-length case I don't think we should change as I do find
>     > NA (is logical!) to be an appropriate logical answer.
>
>     Can you explain your reasoning a bit more here? I'd like to understand
>     the general principle, because from my perspective it's more
>     parsimonious to say that the inputs to || and && must be length 1,
>     rather than to say that inputs could be length 0 or length 1, and in
>     the length 0 case they are replaced with NA.
>
>     Hadley
>
> I would say the value NA would cause warnings later on, that are easy to track down, so a return of NA is far less likely to cause problems than an unintended TRUE or FALSE. And I guess there would be some code reliant on 'logical(0) || TRUE' returning TRUE, that wouldn't necessarily be a mistake.
>
> But I think it's hard to predict exactly how people are using functions. I personally can't imagine a situation where I'd use || or && outside an if-statement, so I'd rather keep the current behaviour, because I'm not sure whether I rely on logical(0) || TRUE somewhere in my code (even though that would be ugly code, it's not wrong per se).
> But I could always rewrite it, so I believe it's more a question of how much would have to be rewritten. Maybe implement it first in devel, to see how many people would complain?
>
> Emil Bode