[Rd] "+" for character method...

Martin Maechler maechler at stat.math.ethz.ch
Fri Aug 25 22:55:17 CEST 2006


>>>>> "Duncan" == Duncan Murdoch <murdoch at stats.uwo.ca>
>>>>>     on Fri, 25 Aug 2006 13:18:42 -0400 writes:

    Duncan> On 8/25/2006 12:31 PM, Martin Maechler wrote:
    >> This thread reminds me of an old recurring (last May!)
    >> theme which maybe fits well on a late Friday afternoon...
    >> 
    >> There have been proposals to make "+" work in S (and
    >> R) like in some other languages, namely for character
    >> (vectors),
    >> 
    >> a + b := paste(a,b, sep="")
    >> 
    >> IIRC, when this theme came up last, the one argument
    >> against it was the penalty of method dispatch that we
    >> were not willing to pay for something as fundamentally
    >> speed-important as "+" -- which is a .Primitive in R
    >> exactly for that reason of efficiency.
    >> 
    >> But then, we actually do dispatch for "+" -- internally
    >> in C code via DispatchGroup() --- but only if we need, so
    >> not when usual numeric/complex arguments are used.
    >> 
    >> I think - but may be wrong - it should be possible to
    >> also check very fast for two "character" arguments and in
    >> that case do a fast version of paste(a, b, sep="").

    Duncan> But for consistency shouldn't this work if only one
    Duncan> of the args is character, coercing the other to
    Duncan> character?  E.g. we have

    Duncan> > "2" > 10
    Duncan> [1] TRUE

yes.  But see also below
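
As a small illustration of the behaviour being compared here (a sketch
of how R treats the two kinds of operator, not part of any proposal):
the comparison operators coerce the numeric argument to character and
then compare strings, whereas arithmetic "+" refuses a character
argument. The variable name `res` is only for illustration.

```r
# Comparison: 10 is coerced to "10", and the two strings are compared.
"2" > 10
10 < "2"

# Arithmetic: no such coercion happens; "+" signals an error when one
# argument is character. Capture the error message instead of stopping.
res <- tryCatch("1" + 2, error = function(e) conditionMessage(e))
res
```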

    >> When this last came up (in May), Brian said the following
    >> about the fact that you cannot simply define
    >> "+.character":
    >> 
    >>>> I would think that the intention was also to positively
    >>>> discourage messing with the basics of R, as if you were
    >>>> able to do this erroneous uses would likely not get
    >>>> caught.
    >> (https://stat.ethz.ch/pipermail/r-help/2006-May/104751.html)
    >> and subsequently
    >> (https://stat.ethz.ch/pipermail/r-help/2006-May/104754.html)
    >> gave an example for this
    >> 
    >>>> 2 + x, for example, where x is not numeric.

    Duncan> This is a valid concern, but I think the clarity
    Duncan> obtained by coding paste operations using + is worth
    Duncan> it.

    Duncan> For example, the first instance of paste(a, b,
    Duncan> sep="") I see in the source is

    Duncan> is.ALL(structure(1:7, names = paste("a",1:7,sep="")))

    Duncan> in base/demo/is.things.R

    Duncan> which I find clearer as

    Duncan> is.ALL(structure(1:7, names = "a" + 1:7))


    Duncan> But then I'm used to using + for strings from
    Duncan> Borland's Pascal extensions; to a C-speaker the
    Duncan> meaning may not be so obvious.

yes.  I think however that, if we keep speed, clarity, and the
catching of user errors all in mind, it would be enough - and
better - to only dispatch to paste(.,.) when both arguments are
character (vectors), i.e., the above case would need
 "a" + as.character(1:7) or "a" + paste(1:7) or "a" + format(1:7)
which after all is really clearer, even more so for cases like
 "1" + 2  which I'd rather keep giving errors.

If  Char + Num  should work as above, then
    Num + Char  should too (since, after all, "+" should be
    commutative apart from floating point precision issues),
and so the internal C code gets a bit more complicated and slightly
slower, something we had in mind to strongly avoid.
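
The two candidate rules can be sketched at R level as follows. This is
only an illustration under assumptions: the operators %+% and %++% are
hypothetical names that do not exist in base R, and the actual
proposal concerns the internal C code behind "+", which this sketch
does not touch.

```r
# Hypothetical operator: concatenate only when BOTH arguments are character.
`%+%` <- function(a, b) {
  if (!(is.character(a) && is.character(b)))
    stop("both arguments must be character")
  paste(a, b, sep = "")
}

# Hypothetical operator: accept the call when EITHER argument is
# character, letting paste() coerce the other side.
`%++%` <- function(a, b) {
  if (!(is.character(a) || is.character(b)))
    stop("at least one argument must be character")
  paste(a, b, sep = "")
}

# Strict rule: the caller coerces explicitly; mixed types are rejected.
strict <- "a" %+% as.character(1:7)
failed <- tryCatch("1" %+% 2, error = function(e) "error")

# Coercing rule: Char + Num works, so Num + Char has to work as well.
left  <- "a" %++% (1:7)
right <- (1:7) %++% "a"
```

With the coercing rule, a likely user error such as "1" %++% 2 is not
caught; it is quietly pasted together instead.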

Martin

    >> I wonder however, if we do this in C, and basically only
    >> go into the paste-branch when both arguments are
    >> characters, if we wouldn't get to a nice useful solution
    >> without a noticeable performance penalty.
    >> 
    >> This would also solve my other, slightly related
    >> uneasiness: many times in the past, when using
    >> paste(..., sep='') in function definitions I had wanted
    >> this (empty sep) to be the default and to have an
    >> easier, more readable way to achieve the same.
    >> 
    >> But then these all are just musings at the end of the
    >> week...
    >> 
    >> Martin Maechler, ETH Zurich
    >> 
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
