[R] do.call("+", ...)

Fri Nov 17 16:02:39 CET 2006

On 11/17/2006 9:15 AM, Robin Hankin wrote:
> Thanks for this Duncan
> 
>>> Why is Peter Dalgaard's suggestion necessary?  Why can't  "+"
>>> take more than two arguments?
>>
>> One reason is that it's a binary operator, so that's all it needs to
>> take.  We have the sum function for multiple operands.
>>
>> I would guess the historical reason is so that it can share code with
>> other binary operators.  For example, + currently shares code with the
>> other binary operators -, *, /, ^, %%, %/%, but the grouping needed
>> varies between them:  a-b-c == (a-b)-c, but a^b^c == a^(b^c).  R lets
>> the parser handle operator binding.
>>
> 
> 
> OK, I see.  But in algebra the  "+" symbol is special: it is reserved
> exclusively for associative and commutative operations [thus a+b+c is
> always well-defined]; perhaps the parser could fall in with this  
> convention?

Sure it could, but it would be some non-trivial amount of work to 
rewrite all the code that works now, and it's not clear that the gain 
would really be all that great.
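
For readers following along, here is a minimal sketch of the behaviour under discussion. It assumes a version of R that provides Reduce() in base, and it is only one possible way to add several operands (not necessarily the suggestion referred to earlier in the thread). The variable names are illustrative.

```r
# `+` accepts one or two arguments only, so a three-argument call fails.
res <- tryCatch(do.call("+", list(1, 2, 3)), error = function(e) e)
plus_failed <- inherits(res, "error")

# Reduce() folds the binary operator from the left: (a + b) + c3.
a  <- matrix(1:4, 2)
b  <- matrix(5:8, 2)
c3 <- matrix(9:12, 2)
folded <- Reduce(`+`, list(a, b, c3))
same_as_chain <- identical(folded, a + b + c3)

# The parser decides the grouping of chained binary operators.
left_minus <- 10 - 3 - 2   # parsed as (10 - 3) - 2
right_pow  <- 2^3^2        # parsed as 2^(3^2)
```

Because the fold is built from the ordinary binary operator, it keeps elementwise semantics for matrices, unlike sum(), which collapses everything to a single number.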

(By the way, I would say that we should look to computer languages for 
precedent of the meaning of +, not to algebra.  Evidence that R is a 
computer language, in case there are any doubters, is the fact that one 
uses a * b for multiplication, rather than the algebraic convention a b.

There are a number of computer languages that define + in a way that is 
not commutative, e.g. when it is used as the symbol for string 
concatenation:  "abc" + "def" == "abcdef".  I'd like R to do 
concatenation this way too, because it makes code a lot more readable 
than using paste("abc", "def", sep="").  And even now, "+" in R isn't 
associative, because of the limitations of the number systems it uses. 
Often the order of operations can have a large impact on the answer, e.g.

 > x <- as.integer(2^30)
 > y <- as.integer(2^30)
 > z <- as.integer(-1)
 > x + y + z
[1] NA
Warning message:
NAs produced by integer overflow in: x + y
 > x + (y + z)
[1] 2147483647

So I wouldn't accept that "+" should be reserved for associative and 
commutative operations.)
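
The same point can be sketched for doubles and for string concatenation. This is an illustration under stated assumptions: the operator name `%+%` is hypothetical (R does not define it), and the floating-point values are just a convenient example, not taken from the message above.

```r
# Floating-point addition is not associative: the two groupings
# differ in the last bit, so an exact comparison fails.
lhs <- (0.1 + 0.2) + 0.3
rhs <- 0.1 + (0.2 + 0.3)
fp_associative <- lhs == rhs

# A user-defined infix operator for concatenation (hypothetical name).
`%+%` <- function(a, b) paste(a, b, sep = "")
ab <- "abc" %+% "def"
ba <- "def" %+% "abc"   # the operands do not commute
```

Comparing lhs and rhs with all.equal() instead of == treats them as equal within tolerance, which is the usual way to compare doubles in R.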

>> By the way, another complaint is that sum() is supposed to be generic,
>> but you can't define a sum.matrix() method so that sum(a,b,c) does the
>> same as a+b+c when a is a matrix.  (This would probably be a bad idea
>> because people may be relying on the current behaviour, but R tries not
>> to prevent people from testing out bad ideas.)  ...
> 
> 
> Can I just clarify this?  I can see that it's a bad idea, but I don't quite
> see why one *can't* do it.  sum() is generic, and the manpage says
> that methods can be defined for it directly.

As Brian pointed out, the problem is that "matrix" is a class(), but not 
an oldClass().  Some generics only dispatch on oldClass(), because it 
would just be too slow if they had to check for methods on everything 
even when no class was specified.
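
A minimal sketch of the distinction, assuming sum() dispatches internally only on objects that carry an explicit class attribute. The class name "mymatrix" and the method bodies are hypothetical and exist only for this illustration.

```r
m <- matrix(1:4, 2)
has_implicit_class <- "matrix" %in% class(m)   # class() reports "matrix"
has_class_attr <- !is.null(oldClass(m))        # but no class attribute is set

# This method is never reached for a plain matrix, because there is no
# class attribute for the internal dispatch to look at.
sum.matrix <- function(..., na.rm = FALSE) stop("sum.matrix was dispatched")
plain_total <- sum(m)   # falls through to the built-in sum

# With an explicit class attribute, an S3 method for sum() is found.
sum.mymatrix <- function(..., na.rm = FALSE) {
  # Strip the class first so that `+` and Reduce() see plain matrices.
  Reduce(`+`, lapply(list(...), unclass))
}
a <- m
oldClass(a) <- "mymatrix"
b <- a
elementwise <- sum(a, b)   # behaves like a + b on the underlying matrices
```

As noted in the quoted text, redefining sum() this way is probably a bad idea in real code; the sketch only shows where dispatch does and does not happen.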

Of course, this just leads to another question:  why don't we always 
attach class "matrix" to matrices so they do use the oldClass mechanism? 
I'm not sure whether the answer to this is just because that's the way 
it happened and all old code expects things to be that way, or whether 
there are still good efficiency or other reasons for this.

Duncan Murdoch