[Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Thu Aug 30 14:03:55 CEST 2018

Hi,

I absolutely second Henrik's suggestion.

On 08/30/2018 01:09 PM, Emil Bode wrote:
> I have to disagree, I think one of the advantages of '||' (or &&) is the lazy evaluation, i.e. you can use the first condition to "not care" about the second (and stop errors from being thrown).

I do not think Henrik's proposal implies that both arguments of `||` or 
`&&` should be evaluated before the condition is decided. It implies 
only that if an argument is evaluated and its length is not one, an 
error should be raised instead of the argument being silently truncated.
So your argument is orthogonal to the issue.
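
A minimal sketch of what I mean, assuming a hypothetical helper 
strict_and() (it is not part of base R, and it is not necessarily how 
the proposal would be implemented). It relies on R's lazy evaluation of 
function arguments: `y` is only forced, and only then length-checked, 
if `x` does not already decide the result.

```r
# Hypothetical helper (not part of base R): a strict, but still lazy,
# scalar AND.
strict_and <- function(x, y) {
  # x is always needed, so its length is checked right away.
  if (length(x) != 1L) stop("non-scalar left-hand side in strict_and()")
  # Short-circuit: y is a promise and is never forced on this path.
  if (isFALSE(as.logical(x))) return(FALSE)
  # Only now is y evaluated, and only now is its length checked.
  if (length(y) != 1L) stop("non-scalar right-hand side in strict_and()")
  as.logical(x) && as.logical(y)
}

x <- c(1, 2)
# Emil's check, nested because strict_and() takes two arguments.
ok <- strict_and(class(x) == "numeric",
        strict_and(length(x) == 1,
          strict_and(x > 0, x < 1)))
```

Here `ok` is FALSE and no error is raised: `length(x) == 1` is FALSE, 
so `x > 0` and `x < 1` are never evaluated. An error only appears when 
a non-scalar argument is actually evaluated, e.g. 
`strict_and(c(TRUE, TRUE), FALSE)`.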

> So if I want to check if x is a length-one numeric with a value between 0 and 1, I can do 'class(x)=='numeric' && length(x)==1 && x>0 && x<1'.
> In your proposal, having x=c(1,2) would throw an error or multiple warnings.
> Also code that relies on the second argument not being evaluated would break, as we need to evaluate y in order to know length(y)
> There may be some benefit in checking for length(x) only, though that could also cause some false positives (e.g. 'x==-1 || length(x)==0' would be a bit ugly, but not necessarily wrong, same for someone too lazy to write x[1] instead of x).
> 
> And I don’t really see the advantage. The casting to length one is (I think), a feature, not a bug. If I have/need a length one x, and a length one y, why not use '|' and '&'? I have to admit I only use them in if-statements, and if I need an error to be thrown when x and y are not length one, I can use the shorter versions and then the if throws a warning (or an error for a length-0 or NA result).
> 
> I get it that for someone just starting in R, the differences between | and || can be confusing, but I guess that's just the price to pay for having a vectorized language.

I have used R for about 10 years, and regularly use `||` and `&&` for 
their standard purpose (the one they serve in most programming 
languages: not evaluating all arguments when that is not required to 
decide whether the condition is TRUE). I cannot recall a single case 
where I wanted to use them to test whether only the *first* elements 
of vectors fulfill the given condition.

However, I regularly type `||` or `&&` by mistake when I actually want 
`|` or `&`, and have no chance to spot the error because the arguments 
are silently truncated.
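
A small sketch of the kind of mistake I mean (the variable names are 
made up for illustration). Because the behaviour of `||` on longer 
vectors depends on the R version, the truncation is written out 
explicitly with `[1]`, which is what Henrik describes for R 3.5.1.

```r
x <- c(-7, 3, -2)

# Intended: element-wise test, one result per element of x.
intended <- x > 0 | x < -5
# TRUE TRUE FALSE

# A mistyped '||' in R 3.5.1 used only the first element of each side;
# spelled out here explicitly so the sketch runs on any R version.
truncated <- (x > 0)[1] || (x < -5)[1]
# TRUE

# Subsetting with the scalar recycles it and silently keeps everything.
x[intended]   # -7  3
x[truncated]  # -7  3 -2
```

Neither line signals anything, so the wrong result is only noticed if 
one happens to inspect it.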

Regards,
Denes

> 
> Best regards,
> Emil Bode
> 
> On 29/08/2018, 05:03, "R-devel on behalf of Henrik Bengtsson" <r-devel-bounces using r-project.org on behalf of henrik.bengtsson using gmail.com> wrote:
> 
>      # Issue
>      
>      'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
>      using R 3.5.1),
>      
>      > c(TRUE, TRUE) || FALSE
>      [1] TRUE
>      > c(TRUE, FALSE) || FALSE
>      [1] TRUE
>      > c(TRUE, NA) || FALSE
>      [1] TRUE
>      > c(FALSE, TRUE) || FALSE
>      [1] FALSE
>      
>      This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
>      same) and it also applies to 'x && y'.
>      
>      Note also how the above truncation of 'x' is completely silent -
>      there's neither an error nor a warning being produced.
>      
>      
>      # Discussion/Suggestion
>      
>      Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
>      mistake.  Either the code is written assuming 'x' and 'y' are scalars,
>      or there is a coding error and vectorized versions 'x | y' and 'x & y'
>      were intended.  Should 'x || y' always be considered a mistake if
>      'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
>      or an error?  For instance,
>      
>      > x <- c(TRUE, TRUE)
>      > y <- FALSE
>      > x || y
>      
>      Error in x || y : applying scalar operator || to non-scalar elements
>      Execution halted
>      
>      What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
>      'x || y' returns 'NA' in such cases, e.g.
>      
>      > logical(0) || c(FALSE, NA)
>      [1] NA
>      > logical(0) || logical(0)
>      [1] NA
>      > logical(0) && logical(0)
>      [1] NA
>      
>      I don't know the background for this behavior, but I'm sure there is
>      an argument behind that one.  Maybe it's simply that '||' and '&&'
>      should always return a scalar logical and neither TRUE nor FALSE can
>      be returned.
>      
>      /Henrik
>      
>      PS. This is in the same vein as
>      https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
>      - in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
>      _R_CHECK_LENGTH_1_CONDITION_=true
>      
>      ______________________________________________
>      R-devel using r-project.org mailing list
>      https://stat.ethz.ch/mailman/listinfo/r-devel
>      