(PR#7824) [Rd] handling of zero and negative indices in
ripley at stats.ox.ac.uk
Fri May 6 14:43:57 CEST 2005
I've put this in (with some different wording). Although S blithely
accepts mis-dimensioned index matrices I agree this is wrong and have made
it an error.
On Fri, 29 Apr 2005 tplate at acm.org wrote:
> This message contains a description of what looks like a bug, examples
> of the suspect behavior, a proposed change to the C code to change this
> behavior, example of behavior with the fix, and suggestions for 3 places
> to update the documentation to reflect the proposed behavior. It is
> submitted for consideration for inclusion in R. Comments are requested.
>
> Currently, the code for subscripting by matrices checks that values in
> the matrix are not greater than the dimensions of the array being
> indexed. However, it does not check for zero or negative indices, and
> blindly does index computation with them as though they were positive
> indices (including for negative indices whose absolute value exceeds the
> dimensions of the array being indexed). Here are examples of indexing
> by matrices that do not return unequivocally sensible results (in most
> cases):
>
> > x <- matrix(1:6,ncol=2)
> > dim(x)
> [1] 3 2
> > x
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
> > x[rbind(c(1,1), c(2,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,1))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,0))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(0,2))]
> [1] 1 5 3
> > x[rbind(c(1,1), c(2,2), c(0,3))]
> Error: subscript out of bounds
> > x[rbind(c(1,1), c(2,2), c(1,0))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(2,0))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(3,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(1,2))]
> [1] 1 5 4
> > x[rbind(c(1,1), c(2,2), c(-1,2))]
> [1] 1 5 2
> > x[rbind(c(1,1), c(2,2), c(-2,2))]
> [1] 1 5 1
> > x[rbind(c(1,1), c(2,2), c(-3,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(-4,2))]
> Error: only 0's may be mixed with negative subscripts
> > x[rbind(c(1,1), c(2,2), c(-1,-1))]
> Error: subscript out of bounds
> >
> > # range checks are not applied to negative indices
> > x <- matrix(1:6, ncol=3)
> > dim(x)
> [1] 2 3
> > x[rbind(c(1,1), c(2,2), c(-3,3))]
> [1] 1 4 1
> > x[rbind(c(1,1), c(2,2), c(-4,3))]
> [1] 1 4
> >
> > version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    1.0
> year     2005
> month    04
> day      18
> language R
> >
>
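The results in the transcript above are consistent with the usual column-major offset arithmetic being applied to zero and negative entries as though they were positive. The following is a minimal standalone sketch of that arithmetic in plain C, not R's actual source; the function name `naive_linear_index` and its signature are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not part of R: 1-based column-major linear index
 * for one row of an index matrix, computed with NO sign checks.
 * dims[j] is the extent of dimension j; row[j] is the subscript given
 * for dimension j. */
int naive_linear_index(const int *dims, const int *row, size_t ndims)
{
    int offset = 0;  /* 0-based offset accumulated so far */
    int stride = 1;  /* product of the extents already consumed */
    for (size_t j = 0; j < ndims; j++) {
        offset += (row[j] - 1) * stride;
        stride *= dims[j];
    }
    return offset + 1;  /* back to 1-based */
}
```

With dims = {3, 2}, the row (0, 2) maps to 3 and the row (-1, 2) maps to 2, matching the `[1] 1 5 3` and `[1] 1 5 2` results above. The row (-3, 2) maps to 0, which ordinary vector indexing then drops, and (-4, 2) maps to -1, which cannot be mixed with the positive indices 1 and 5.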
> The following version of mat2indsub() (to replace the one in
> src/main/subscript.c) allows zero indices and explicitly disallows all
> negative indices:
>
> /* Special Matrix Subscripting: Handles the case x[i] where */
> /* x is an n-way array and i is a matrix with n columns. */
> /* This code returns a vector containing the integer subscripts */
> /* to be extracted when x is regarded as unravelled. */
> /* Negative indices are not allowed. */
> /* A zero anywhere in a row will cause a zero in the same */
> /* position in the result. */
>
> SEXP mat2indsub(SEXP dims, SEXP s)
> {
>     int tdim, j, i, k, nrs = nrows(s);
>     SEXP rvec;
>
>     PROTECT(rvec = allocVector(INTSXP, nrs));
>     s = coerceVector(s, INTSXP);
>     setIVector(INTEGER(rvec), nrs, 0);
>
>     for (i = 0; i < nrs; i++) {
>         tdim = 1;
>         /* compute 0-based subscripts for a row (0 in the input */
>         /* gets -1 in the output here) */
>         for (j = 0; j < LENGTH(dims); j++) {
>             k = INTEGER(s)[i + j * nrs];
>             if (k == NA_INTEGER) {
>                 INTEGER(rvec)[i] = NA_INTEGER;
>                 break;
>             }
>             if (k < 0)
>                 error(_("cannot have negative values in matrices used as subscripts"));
>             if (k == 0) {
>                 INTEGER(rvec)[i] = -1;
>                 break;
>             }
>             if (k > INTEGER(dims)[j])
>                 error(_("subscript out of bounds"));
>             INTEGER(rvec)[i] += (k - 1) * tdim;
>             tdim *= INTEGER(dims)[j];
>         }
>         /* transform to 1 based subscripting (0 in the input */
>         /* gets 0 in the output here) */
>         if (INTEGER(rvec)[i] != NA_INTEGER)
>             INTEGER(rvec)[i]++;
>     }
>     UNPROTECT(1);
>     return (rvec);
> }
>
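For experimenting with the proposed rules outside of R, the same per-row logic can be sketched on plain int arrays. This is an illustrative approximation, not the code that went into R: the names `mat_to_linear`, `NA_INT` and the `IDX_*` status codes are hypothetical, errors are reported through a return value instead of error(), and `INT_MIN` is assumed as a stand-in for NA_INTEGER.

```c
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* assumption: stands in for R's NA_INTEGER */

enum { IDX_OK = 0, IDX_NEGATIVE = -1, IDX_OUT_OF_BOUNDS = -2 };

/* s is an nrs x ndims index matrix stored column-major, as in R.
 * On success, out[i] holds the 1-based linear index for row i,
 * 0 if the row contained a zero, or NA_INT if it contained an NA
 * (whichever is met first when scanning the row left to right). */
int mat_to_linear(const int *dims, size_t ndims,
                  const int *s, size_t nrs, int *out)
{
    for (size_t i = 0; i < nrs; i++) {
        int offset = 0, stride = 1, result = 0, done = 0;
        for (size_t j = 0; j < ndims && !done; j++) {
            int k = s[i + j * nrs];
            if (k == NA_INT) {            /* test NA before the sign */
                result = NA_INT;
                done = 1;
            } else if (k < 0) {
                return IDX_NEGATIVE;
            } else if (k == 0) {
                result = 0;
                done = 1;
            } else if (k > dims[j]) {
                return IDX_OUT_OF_BOUNDS;
            } else {
                offset += (k - 1) * stride;
                stride *= dims[j];
            }
        }
        out[i] = done ? result : offset + 1;
    }
    return IDX_OK;
}
```

As in the proposed mat2indsub(), scanning a row stops at the first zero or NA, so a row such as (0, 3) yields 0 without triggering the bounds check on its second entry.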
> With this version, the above commands (plus a couple more) produce the
> following output (commands that fail are wrapped in try() so that they
> can be included in a test file):
>
> > x <- matrix(1:6,ncol=2)
> > dim(x)
> [1] 3 2
> > x
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
> > x[rbind(c(1,1), c(2,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,1))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,2))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(0,3))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(1,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(2,0))]
> [1] 1 5
> > x[rbind(c(1,1), c(2,2), c(3,0))]
> [1] 1 5
> > x[rbind(c(1,0), c(0,2), c(3,0))]
> numeric(0)
> > x[rbind(c(1,0), c(0,0), c(3,0))]
> numeric(0)
> > x[rbind(c(1,1), c(2,2), c(1,2))]
> [1] 1 5 4
> > x[rbind(c(1,1), c(2,NA), c(1,2))]
> [1] 1 NA 4
> > x[rbind(c(1,0), c(2,NA), c(1,2))]
> [1] NA 4
> > try(x[rbind(c(1,1), c(2,2), c(-1,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-1, 2))]) :
> cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-2,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-2, 2))]) :
> cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-3,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-3, 2))]) :
> cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-4,2))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-4, 2))]) :
> cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-1,-1))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-1, -1))]) :
> cannot have negative values in matrices used as subscripts
> >
> > # verify that range checks are applied to negative indices
> > x <- matrix(1:6, ncol=3)
> > dim(x)
> [1] 2 3
> > try(x[rbind(c(1,1), c(2,2), c(-3,3))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-3, 3))]) :
> cannot have negative values in matrices used as subscripts
> > try(x[rbind(c(1,1), c(2,2), c(-4,3))])
> Error in try(x[rbind(c(1, 1), c(2, 2), c(-4, 3))]) :
> cannot have negative values in matrices used as subscripts
> >
>
>
> The documentation page for ?Extract could have the following sentences
> added to the end of the 2nd para in the description of arguments "i, j,
> ...":
>
> Negative indices are not allowed in matrices used as indices.
> NA and zero values are allowed: rows containing a zero are
> omitted from the result, and rows containing an NA produce an
> NA in the result.
>
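The rule in the proposed wording (zero rows omitted, NA rows giving NA) can be illustrated by the extraction step that follows the index computation. This is a hedged sketch in plain C, not R's implementation: `extract_by_linear_index` and `NA_INT` are hypothetical names, and the indices are assumed to have been validated already as 0, NA, or in range.

```c
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* assumption: sentinel standing in for R's NA */

/* Copy x[idx[i] - 1] into out for each 1-based linear index.
 * Zero indices are skipped; NA indices yield NA_INT.
 * Returns the number of elements written (out needs room for nidx).
 * Indices are assumed to be 0, NA_INT, or within the bounds of x. */
size_t extract_by_linear_index(const int *x, const int *idx,
                               size_t nidx, int *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nidx; i++) {
        if (idx[i] == 0)
            continue;  /* row contained a zero: omitted from the result */
        out[n++] = (idx[i] == NA_INT) ? NA_INT : x[idx[i] - 1];
    }
    return n;
}
```

For x holding 1 to 6 and linear indices {0, NA, 4}, the result has length 2 and contains NA and 4, which corresponds to the `[1] NA 4` line shown for `x[rbind(c(1,0), c(2,NA), c(1,2))]`.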
> In "Introduction to R", the same material could be added to the end of
> the section "Index Arrays" (in the chapter "Arrays and matrices").
>
> In "R Language Definition", the third paragraph of the section "Indexing
> matrices and arrays" (Section 3.4.2 in my copy) currently reads:
>
>> It is possible to use a matrix of integers as an index. In
>> this case, the number of columns of the matrix should match
>> the number of dimensions of the structure, and the result
>> will be a vector with length as the number of rows of the
>> matrix. The following example shows how to extract the
>> elements m[1, 1] and m[2, 2] in one operation.
>
> It could be changed to the following:
>
> It is possible to use a matrix of integers as an index.
> Negative indices are not allowed in matrices used as
> indices. NA and zero values are allowed: rows containing a
> zero are omitted from the result, and rows containing an NA
> produce an NA in the result. To use a matrix as an index,
> the number of columns of the matrix should match the number
> of dimensions of the structure, and the result will be a
> vector with length as the number of rows of the matrix
> (except when the matrix contains zeros). The following
> example shows how to extract the elements m[1, 1] and m[2,
> 2] in one operation.
>
>
> Additionally, indexing by matrix just goes ahead and uses matrix
> elements as vector indices in cases where the number of columns in the
> matrix does not match the number of dimensions in the array. E.g.:
>
> > x <- matrix(1:6,ncol=2)
> > x[rbind(c(1,1,1), c(2,2,2))]
> [1] 1 2 1 2 1 2
> >
>
> Would it not be preferable behavior to stop with an error in such cases?
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595