[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1
Henrik Bengtsson
henrik.bengtsson sending from gmail.com
Sat Sep 1 02:24:32 CEST 2018
Thanks all for a great discussion.

I think we can introduce assertions for length(x) <= 1 (and produce a
warning/error if not) without changing the value of these &&/||
expressions.

R 3.4.0 introduced '_R_CHECK_LENGTH_1_CONDITION_=true', which turns
the warning "the condition has length > 1 and only the first element
will be used" in cases like 'if (c(TRUE, TRUE)) 42' into an error. The
idea is to later make '_R_CHECK_LENGTH_1_CONDITION_=true' the new
default. I guess that someday this will always produce an error.
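
To make the existing switch concrete, here is a minimal sketch that
sets the variable for the running session and classifies what the
length-2 condition does. It assumes the variable is read when the
condition is evaluated rather than only at startup; 'condition_outcome'
is just an illustrative helper name, not part of R.

```r
# Classify what a length-2 'if' condition does in this R session.
# 'condition_outcome' is a hypothetical helper, not an R function.
condition_outcome <- function() {
  tryCatch({
    if (c(TRUE, TRUE)) 42
    "value"
  },
  warning = function(w) "warning",
  error = function(e) "error")
}

# Remember the current setting so it can be restored afterwards.
old <- Sys.getenv("_R_CHECK_LENGTH_1_CONDITION_", unset = NA)
Sys.setenv("_R_CHECK_LENGTH_1_CONDITION_" = "true")

outcome <- condition_outcome()

if (is.na(old)) {
  Sys.unsetenv("_R_CHECK_LENGTH_1_CONDITION_")
} else {
  Sys.setenv("_R_CHECK_LENGTH_1_CONDITION_" = old)
}

print(outcome)
```

With the variable set I would expect "error" here, and "warning" in a
session where it is unset, but that depends on the R version in use.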

Similarly, the test for this &&/|| issue could be controlled by
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=warn' and
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=err', possibly with
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=true' mapping to 'warn' at first and
to 'err' later.
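
A rough R-level sketch of the proposed check for &&, only to
illustrate the idea (the real check would live in the C code). The
helper names 'first_element' and 'and_checked' and the 'mode' argument
are made up for this example; the values "warn", "err" and "true"
mirror the proposal above and are not an existing R setting. The sketch
also skips the type checks that && itself performs.

```r
# Sketch only: hypothetical helpers emulating the proposed length check.
first_element <- function(v, label, mode) {
  if (length(v) > 1L) {
    msg <- sprintf("length(%s) = %d, but && expects length 1",
                   label, length(v))
    if (identical(mode, "err")) {
      stop(msg, call. = FALSE)
    } else if (mode %in% c("warn", "true")) {
      warning(msg, call. = FALSE)
    }
  }
  as.logical(v[1L])  # zero-length input gives NA; the value is unchanged
}

and_checked <- function(x, y, mode = "warn") {
  x1 <- first_element(x, "x", mode)
  if (identical(x1, FALSE)) return(FALSE)  # short-circuit: 'y' is not forced
  y1 <- first_element(y, "y", mode)        # 'y' is checked only when needed
  x1 & y1  # both are length 1 here, so this follows the && truth table
}
```

Because 'y' is only inspected when it is actually needed,
and_checked(FALSE, c(TRUE, FALSE)) passes silently, while
and_checked(c(TRUE, FALSE), TRUE, mode = "err") stops with an error.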

Changing the behavior of cases where length(x) == 0 is more likely to
break *some* code out there, and might require a separate
discussion/set of validations. It's not unlikely that someone
actually relies on this resolving to NA. BTW, since it hasn't been
explicitly said, it's "logical" that TRUE && logical(0) resolves to
NA, because it currently behaves as TRUE[1] && logical(0)[1], which
resolves to TRUE && NA => NA. If a decision on the zero-length case
would delay fixing the length(x) > 1 case, I would postpone the
decision on the former.
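
The two steps of that reasoning can be spelled out explicitly. This
only uses the indexed form, so it does not depend on how a given R
version treats a zero-length operand passed directly to &&.

```r
x <- TRUE
y <- logical(0)

y1 <- y[1]          # indexing past the end of a logical vector gives NA
res <- x[1] && y1   # TRUE && NA

print(y1)   # NA
print(res)  # NA
```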

/Henrik

On Fri, Aug 31, 2018 at 2:48 AM Emil Bode <emil.bode using dans.knaw.nl> wrote:
>
>
> On 30/08/2018, 20:15, "R-devel on behalf of Hadley Wickham" <r-devel-bounces using r-project.org on behalf of h.wickham using gmail.com> wrote:
>
> On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
> <maechler using stat.math.ethz.ch> wrote:
> >
> > >>>>> Joris Meys
> > >>>>> on Thu, 30 Aug 2018 14:48:01 +0200 writes:
> >
> > > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
> > > <toth.denes using kogentum.hu> wrote:
> > >> Note that `||` and `&&` have never been symmetric:
> > >>
> > >> TRUE || stop() # returns TRUE
> > >> stop() || TRUE # returns an error
> > >>
> > >>
> > > Fair point. So the suggestion would be to check whether x
> > > is of length 1 and whether y is of length 1 only when
> > > needed. I.e.
> >
> > > c(TRUE,FALSE) || TRUE
> >
> > > would give an error and
> >
> > > TRUE || c(TRUE, FALSE)
> >
> > > would pass.
> >
> > > Thought about it a bit more, and I can't come up with a
> > > use case where the first line must pass. So if the short
> > > circuiting remains and the extra check only gives a small
> > > performance penalty, adding the error could indeed make
> > > some bugs more obvious.
> >
> > I agree "in theory".
> > Thank you, Henrik, for bringing it up!
> >
> > In practice I think we should start having a warning signalled.
> > I have checked the source code in the mean time, and the check
> > is really very cheap
> > { because it can/should be done after checking isNumber(): so
> > then we know we have an atomic and can use XLENGTH() }
> >
> >
> > The 0-length case I don't think we should change as I do find
> > NA (is logical!) to be an appropriate logical answer.
>
> Can you explain your reasoning a bit more here? I'd like to understand
> the general principle, because from my perspective it's more
> parsimonious to say that the inputs to || and && must be length 1,
> rather than to say that inputs could be length 0 or length 1, and in
> the length 0 case they are replaced with NA.
>
> Hadley
>
> I would say the value NA would cause warnings later on, that are easy to track down, so a return of NA is far less likely to cause problems than an unintended TRUE or FALSE. And I guess there would be some code reliant on 'logical(0) || TRUE' returning TRUE, that wouldn't necessarily be a mistake.
>
> But I think it's hard to predict how exactly people are using functions. I personally can't imagine a situation where I'd use || or && outside an if-statement, so I'd rather have the current behaviour, because I'm not sure if I'm reliant on logical(0) || TRUE somewhere in my code (even though that would be ugly code, it's not wrong per se)
> But I could always rewrite it, so I believe it's more a question of how much would have to be rewritten. Maybe implement it first in devel, to see how many people would complain?
>
> Emil Bode