[Rd] head.matrix can return 1000s of columns ..
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Thu Nov 28 15:30:51 CET 2019
>>>>> Gabriel Becker
>>>>> on Sat, 2 Nov 2019 12:40:16 -0700 writes:
[....................]
In the meantime, Gabe has worked quite a bit and provided a
patch proposal at R's bugzilla, PR#17652,
i.e., here
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17652
A few days ago, I committed a (slightly simplified) version
of that to R-devel (svn rev 77462)
with NEWS entry
* head(x, n) and tail() default and other S3 methods notably for
_vector_ n, e.g. to get a "corner" of a matrix, also extended for
array's of higher dimension, thanks to the patch proposal by Gabe
Becker in PR#16764.
(which contains a *wrong* PR number that I've corrected in the
meantime)
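As an illustration of the vector-n form described in that NEWS entry, here is a minimal sketch. It assumes an R version that already contains the change (this should be R 4.0.0 or later), and the matrix `m` is a made-up example, not taken from the patch:

```r
## Hypothetical 4 x 5 example matrix (not from the original post)
m <- matrix(1:20, nrow = 4, ncol = 5)

## Vector n: first 2 rows and first 3 columns, i.e. the top-left "corner"
top_left <- head(m, c(2, 3))

## tail() with a vector n: last 2 rows and last 2 columns
bottom_right <- tail(m, c(2, 2))

dim(top_left)
dim(bottom_right)
```

With a scalar n, head() and tail() continue to act on the first dimension only.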
A day or so later, CRAN alerted me to the fact that this
change breaks the checks of some CRAN packages; by now it seems
to be about 30.
There were at least two principal reasons, one of which was the
fact that data frame subsetting has been somewhat surprising in R,
without being documented so, *and* some packages have
inadvertently made use of this peculiarity -- which was
inadvertently changed by r77462.
In short, head(<data frame>) kept extraneous attributes
because indeed
d[i, ]
keeps those attributes ... for data frames.
I will amend the head() and tail() methods to remain backward
compatible (as much as sensible) for now, but here's what I've
found about subsetting, i.e., the behavior of the (partly C code
internal) `[` methods in R:
1) For a data frame d, d[i, ] differs from d[i,j],
as the former keeps (extra) attributes whereas the latter drops them,
2) For a matrix, neither form of indexing keeps (extra) attributes.
Here's some simple reproducible R code exhibiting the claim:
##==== Data frame subsetting (vs. matrix, array) "with extra attributes": =====
## data frame w/ a (non-standard) attribute:
str(treeS <- structure(trees, foo = "bar"))
chkMat <- function(M) {
    stopifnot(nzchar(Mfoo <- attr(M, "foo")),
              length(d <- dim(M)) == 2,
              (n <- d[1]) >= 6, d[2] >= 3)
    ## n = nrow(M)
    stopifnot(exprs = { # attribute is kept
        if(inherits(M, "data.frame")) {
            identical( attr(M[ 1:3    , ] , "foo") , "bar") &&
            identical( attr(M[(n-2):n , ] , "foo") , "bar")
        } else { ## matrix
            is.null  ( attr(M[ 1:3    , ] , "foo")) &&
            is.null  ( attr(M[(n-2):n , ] , "foo"))
        }
        ## OTOH, [i,j]-indexing of data frames *does* drop "other" attributes:
        inherits(print(t.ij <- M[(n-2):n, 2:3] ), class(M))
        ## now, the "foo" attribute of M[i,j] is gone!
        is.null(attr(t.ij, "foo"))
    })
}
chkMat(treeS)
chkMat(as.matrix(treeS))
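The two claims can also be summarised in a compact side-by-side sketch. As an assumption of this sketch (not part of the code above), the "foo" attribute is attached directly to the matrix, so that the matrix case starts from an object that definitely carries the attribute; the names `treeM` and `res` are made up for illustration:

```r
## Data frame carrying a non-standard attribute, as in the code above
treeS <- structure(trees, foo = "bar")

## Assumption for this sketch: attach "foo" directly to the matrix
## rather than relying on as.matrix() to carry it over
treeM <- structure(as.matrix(trees), foo = "bar")

## TRUE where the "foo" attribute survives the subsetting
res <- c(
    df_i   = !is.null(attr(treeS[1:3,    ], "foo")),  # d[i, ]
    df_ij  = !is.null(attr(treeS[1:3, 2:3], "foo")),  # d[i, j]
    mat_i  = !is.null(attr(treeM[1:3,    ], "foo")),  # m[i, ]
    mat_ij = !is.null(attr(treeM[1:3, 2:3], "foo"))   # m[i, j]
)
res
```

Only the d[i, ] case for the data frame keeps the attribute; the other three drop it.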
-------
And (to repeat), currently head(d, n) is the same as d[1:n , ]
when n >= 1 and length(n) == 1, and this equality is relied upon
by CRAN package code out there, hence I'll keep it with
the "generalized" head() & tail() in R-devel.
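The equality that packages rely on can be checked with a short sketch like the following; the variable names `h`, `same` and `foo_kept` are made up for illustration, and n = 3 is an arbitrary choice not exceeding nrow(trees):

```r
## Data frame with a non-standard attribute, as in the code above
treeS <- structure(trees, foo = "bar")

n <- 3                               # scalar n >= 1
h <- head(treeS, n)

same     <- identical(h, treeS[1:n, ])  # head(d, n) vs. d[1:n, ]
foo_kept <- attr(h, "foo")              # extra attribute carried along

same
foo_kept
```

Because head(d, n) goes through d[i, ]-style indexing here, the extra attribute is carried along just as it is for d[1:n, ].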
Martin