[Rd] Shouldn't vector indexing with negative out-of-range index give an error?

Wed May 6 10:33:50 CEST 2015

>>>>> John Chambers <jmc at stat.stanford.edu>
>>>>>     on Tue, 5 May 2015 08:39:46 -0700 writes:

    > When someone suggests that we "might have had a reason" for some peculiarity in the original S, my usual reaction is "Or else we never thought of the problem".
    > In this case, however, there is a relevant statement in the 1988 "blue book".  In the discussion of subscripting (p 358) the definition for negative i says: "the indices consist of the elements of seq(along=x) that do not match any elements in -i".

    > This suggests that no bounds checking on -i takes place.

    > John

Indeed!  
Thanks a lot John, for the perspective and clarification!
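
For the record, the blue-book wording quoted above can be written out in R more or less literally. This is only a sketch: neg_index() is a made-up helper name, not anything in base R, and it assumes strictly negative, non-NA integer indices (zeros and mixed signs are deliberately left out).

```r
# Hypothetical helper, not part of R: negative subscripting spelled out as
# "the elements of seq(along=x) that do not match any elements in -i".
neg_index <- function(x, i) {
  # Assumption of this sketch: i is strictly negative with no NAs.
  stopifnot(!anyNA(i), all(i < 0))
  keep <- seq_along(x)[is.na(match(seq_along(x), -i))]
  x[keep]
}

x <- 1:4
y <- as.list(1:4)

# Out-of-range values in -i simply never match, so they drop out silently.
identical(neg_index(x, -9L),    x[-9L])      # TRUE
identical(neg_index(x, -(1:9)), x[-(1:9)])   # TRUE
identical(neg_index(x, -(3:9)), x[-(3:9)])   # TRUE
identical(neg_index(y, -(1:9)), y[-(1:9)])   # TRUE
```

Read this way, there is no natural place for a bounds check: a value in -i that is larger than length(x) just fails to match anything.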

I'm committing a patch to the documentation now.
Martin

    > On May 5, 2015, at 7:01 AM, Martin Maechler <maechler at lynne.stat.math.ethz.ch> wrote:

    >>>>>>> Henrik Bengtsson <henrik.bengtsson at ucsf.edu>
    >>>>>>> on Mon, 4 May 2015 12:20:44 -0700 writes:
    >> 
    >>> In Section 'Indexing by vectors' of 'R Language Definition'
    >>> (http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Indexing-by-vectors)
    >>> it says:
    >> 
    >>> "Integer. All elements of i must have the same sign. If they are
    >>> positive, the elements of x with those index numbers are selected. If
    >>> i contains negative elements, all elements except those indicated are
    >>> selected.
    >> 
    >>> If i is positive and exceeds length(x) then the corresponding
    >>> selection is NA. A negative out of bounds value for i causes an error.
    >> 
    >>> A special case is the zero index, which has null effects: x[0] is an
    >>> empty vector and otherwise including zeros among positive or negative
    >>> indices has the same effect as if they were omitted."
    >> 
    >>> However, that "A negative out of bounds value for i causes an error"
    >>> in the second paragraph does not seem to apply.  Instead, R silently
    >>> ignores negative indices that are out of range.  For example:
    >> 
    >>>> x <- 1:4
    >>>> x[-9L]
    >>> [1] 1 2 3 4
    >>>> x[-c(1:9)]
    >>> integer(0)
    >>>> x[-c(3:9)]
    >>> [1] 1 2
    >> 
    >>>> y <- as.list(1:4)
    >>>> y[-c(1:9)]
    >>> list()
    >> 
    >>> Is the observed non-error the correct behavior and therefore the
    >>> documentation is incorrect, or is it vice versa?  (...or is it me
    >>> missing something)
    >> 
    >>> I get the above on R devel, R 3.2.0, and as far back as R 2.11.0
    >>> (haven't checked earlier versions).
    >> 
    >> Thank you, Henrik!
    >> 
    >> I've checked further back: The change happened between R 2.5.1 and R 2.6.0.
    >> 
    >> The previous behavior was
    >> 
    >>> (1:3)[-(3:5)]
    >> Error: subscript out of bounds
    >> 
    >> If you start reading NEWS.2, you see a *lot* of new features
    >> (and bug fixes) in the 2.6.0 news, but from my browsing, none of
    >> them mentioned the new behavior as a feature.
    >> 
    >> Let's -- for a moment -- declare it a bug in the code, i.e., not
    >> in the documentation:
    >> 
    >> - As 2.6.0  happened quite a while ago (Oct. 2007),  
    >> we could wonder how much R code will break if we fix the bug.
    >> 
    >> - Is the R package authors' community willing to do the necessary
    >> cleanup in their packages ?
    >> 
    >> ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 
    >> 
    >> 
    >> Now, after reading the source code for a while, and looking at
    >> the changes, I've found the log entry
    >> 
    >> ------------------------------------------------------------------------
    >> r42123 | ihaka | 2007-07-05 02:00:05 +0200 (Thu, 05 Jul 2007) | 4 lines
    >> 
    >> Changed the behaviour of out-of-bounds negative
    >> subscripts to match that of S.  Such values are
    >> now ignored rather than tripping an error.
    >> 
    >> ------------------------------------------------------------------------
    >> 
    >> So, it was changed on purpose, by one of the true "R"s, very
    >> much on purpose.
    >> 
    >> Making it a *warning* instead of the original error
    >> may have been both more cautious and more helpful for
    >> detecting programming errors.
    >> 
    >> OTOH, John Chambers, the father of S and hence grandfather of R,
    >> may have had good reasons why it seemed more logical to silently
    >> ignore such out-of-bounds negative indices:
    >> One could argue that
    >> 
    >> x[-5]  means  "leave away the 5-th element of x"
    >> 
    >> and if there is no 5-th element of x, leaving it away should be a no-op.
    >> 
    >> After all this musing and history detection, my gut decision
    >> would be to only change the documentation which Ross forgot to change.
    >> 
    >> But of course, it may be interesting to hear other programmeR's feedback on this.
    >> 
    >> Martin