[R] an issue about subsetting a vector
Richard O'Keefe
r@oknz @end|ng |rom gm@||@com
Mon Mar 25 10:11:25 CET 2024
For questions like this, go to the manual.
Visit cran.r-project.org
The nav-bar on the left includes
"Documentation/Manuals/FAQ/Contributed". Click on Manuals.
Look at "An Introduction to R" and click on the HTML link for the
current release (this is the first link on the page).
Scroll down to see the table of contents.
"2.7 Index vectors; selecting and modifying subsets of a data set"
is the link you want. Click on it.
The way it works is that integer(0) is treated as "A vector of positive
integral quantities", one of the kinds of index vector listed in that
section.
The relevant sentence is "The index vector can be of any length and
the result is of the same length as the index vector."
"Any length" here includes 0.
Your example would be a bit clearer if the elements of a[] were not integers.
So let's make them strings.
> a <- LETTERS[1:5]
> a
[1] "A" "B" "C" "D" "E"
> a[integer(0)]
character(0)
So selecting no elements from a vector of strings gives us a vector
with no strings (rather than a vector with no integers).
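A minimal sketch of the same point for other element types, and of the
"same length as the index vector" rule quoted above. The vectors s, d and b
are made up for illustration.

```r
# Hypothetical example vectors of three different types.
s <- LETTERS[1:5]          # character
d <- c(1.5, 2.5, 3.5)      # double
b <- c(TRUE, FALSE, TRUE)  # logical

# An empty index selects nothing, but the result keeps the element type.
r_s <- s[integer(0)]
r_d <- d[integer(0)]
r_b <- b[integer(0)]
print(r_s)  # character(0)
print(r_d)  # numeric(0)
print(r_b)  # logical(0)

# With positive indices the result is as long as the index vector,
# even when an index repeats or runs past the end (which gives NA).
idx <- c(1, 1, 7)
r_idx <- s[idx]
print(r_idx)  # "A" "A" NA
```

The length of the result follows the index, while the type of the result
follows the vector being indexed.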
-integer(0)
makes perfect sense and does exactly what anyone should expect:
the result and operand of unary - have the same kind of elements and the
same length and corresponding elements (of which there happen in this
case to be none) are negatives of each other. So -integer(0) is predictably
the same as integer(0), and a[-integer(0)] is the same as a[integer(0)].
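This can be checked directly. A short sketch, using the same a as above;
neg_empty is just an illustrative name.

```r
a <- LETTERS[1:5]

# Unary minus negates each element; with no elements there is nothing
# to negate, so the result is again an empty integer vector.
neg_empty <- -integer(0)
print(identical(neg_empty, integer(0)))          # TRUE

# So indexing with either one selects nothing.
print(identical(a[neg_empty], a[integer(0)]))    # TRUE

# For contrast, a non-empty operand: same length, each element negated.
neg_three <- -(1:3)
print(neg_three)                                 # -1 -2 -3
```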
The key point is clarified in section 2.7; _[- _] is NOT special syntax.
What matters in vector[index] is what the *elements* of index are, not
what the syntax of the expression happens to be.
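A small sketch of what "the elements matter, not the syntax" means in
practice. The names idx, pos and mixed are illustrative only.

```r
a <- LETTERS[1:5]

# No minus sign appears inside the brackets, yet elements are dropped,
# because the *values* in idx are negative.
idx <- c(-1, -3)
dropped <- a[idx]
print(dropped)                 # "B" "D" "E"

# Writing the minus sign in the brackets gives the same index values,
# hence the same result.
pos <- c(1, 3)
print(identical(a[-pos], dropped))   # TRUE

# Mixing negative and positive values is an error, again decided by the
# values in the index and not by how the expression is written.
mixed <- tryCatch(a[c(-1, 2)], error = function(e) "error")
print(mixed)                   # "error"
```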
The tricky thing here is that integer(0) is logically BOTH a vector of
positive integer quantities AND a vector of negative integer quantities.
That S (and, following it, R) prefers the first interpretation in this
context is just a fact about S (and, following it, R).
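If what you want is the other interpretation ("drop these positions, and
drop nothing when there are none"), one possible approach is to build the
positions to keep explicitly. This is only a sketch: drop_at() is a
hypothetical helper written for this message, not a base R function.

```r
# drop_at() is a hypothetical helper, not part of base R.
# It keeps every position of x that is not listed in drop, so an empty
# drop vector leaves x unchanged.
drop_at <- function(x, drop) {
  keep <- setdiff(seq_along(x), drop)
  x[keep]
}

a <- LETTERS[1:5]

print(a[-integer(0)])           # character(0): the "select nothing" reading
print(drop_at(a, integer(0)))   # "A" "B" "C" "D" "E"
print(drop_at(a, 3))            # "A" "B" "D" "E"
```

Because the helper always indexes with positive positions, the ambiguity
of the empty index never arises.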
It gets even trickier when you start including the index value 0 in a
vector of otherwise positive (or negative) integers, where in effect the
zeros are ignored.
> a[c(0,0,0,0,0,0)]
character(0)
> a[-c(0,0,0,0,0,0)]
character(0)
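The all-zero indices above are the degenerate case. A short sketch of zeros
mixed with genuinely positive or negative values, using the same a:

```r
a <- LETTERS[1:5]

# Zeros mixed with positive indices are skipped.
with_pos <- a[c(0, 2, 0, 4)]
print(with_pos)   # "B" "D"

# Zeros mixed with negative indices are skipped too.
with_neg <- a[c(0, -2, 0)]
print(with_neg)   # "A" "C" "D" "E"

# A lone zero selects nothing, just like integer(0).
only_zero <- a[0]
print(only_zero)  # character(0)
```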
On Mon, 25 Mar 2024 at 07:31, Ben Bolker <bbolker using gmail.com> wrote:
>
> As with a lot of things in R, the behaviour is counterintuitive but
> (arguably) logical. The key (maybe) is that `a[-x]` does not mean "take
> all of the elements of a except those indicated by `x`", but rather,
> "negate x, then take all but the negative elements". In other words,
>
> -integer(0) is integer(0) (we multiply *each of the elements of
> integer(0)* by -1, but integer(0) has no elements)
>
> that reduces to a[integer(0)]
>
> and from there you get "select none of the elements".
>
> A related point is explained in the R Inferno
> https://www.burns-stat.com/pages/Tutor/R_inferno.pdf section 8.1.13,
> "negative nothing is something".
>
> See also
>
> https://stackoverflow.com/questions/42615728/subsetting-vector-how-to-programatically-pass-negative-index-safely
>
> https://stackoverflow.com/questions/40026975/subsetting-with-negative-indices-best-practices/40029485#40029485
>
> On 2024-03-24 2:19 p.m., Paulo Barata wrote:
> >
> > To the R-Help list,
> >
> > I would like to have a clarification about an issue that I observe when
> > subsetting a vector. This is R version 4.3.3 on Windows 10.
> >
> > -- Example with a vector:
> >
> > > a <- 1:5
> > > a
> > [1] 1 2 3 4 5
> >
> > So we have, with a negative index:
> > > a[-3]
> > [1] 1 2 4 5
> >
> > But when the index is integer(0), we have this:
> >
> > > a[integer(0)]
> > integer(0)
> > > a[-integer(0)]
> > integer(0)
> >
> > When it comes to the function integer(), the R Help file says:
> >
> > "Value: integer creates a integer vector of the specified length. Each
> > element of the vector is equal to 0."
> >
> > But we see below that integer(0) is some kind of null vector, that is,
> > no numbers are represented by integer(0):
> >
> > > class(integer(0))
> > [1] "integer"
> > > length(integer(0))
> > [1] 0
> >
> > So my question: in the expression a[-integer(0)], what happens exactly
> > with the index -integer(0)? We see that integer(0), although of class
> > integer, does not represent any numbers, as its length is zero, so it
> > seems to me that it makes no sense to calculate its negative
> > -integer(0). What exactly is -integer(0)?
> >
> > In particular, why a[-integer(0)] is not the whole vector a, that is,
> > [1] 1 2 3 4 5? In the example below, if the invalid index -99 is
> > presented to a, the result is the whole vector:
> >
> > > a[-99]
> > [1] 1 2 3 4 5
> >
> > If -integer(0) is an invalid index, why do we have this?
> >
> > > a[-integer(0)]
> > integer(0)
> >
> > Why a[-integer(0)] is not the whole vector a?
> >
> > Thank you very much.
> >
> > Paulo Barata
> >
> > Rio de Janeiro, Brazil
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>