[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
Michael Chirico
m|ch@e|ch|r|co4 @end|ng |rom gm@||@com
Mon Sep 16 09:54:07 CEST 2019
Awesome. Gabe, since you already have a workshopped version, would you like
to proceed? Feel free to ping me to review the patch once it's posted.
On Mon, Sep 16, 2019 at 3:26 PM Martin Maechler <maechler using stat.math.ethz.ch>
wrote:
> >>>>> Michael Chirico
> >>>>> on Sun, 15 Sep 2019 20:52:34 +0800 writes:
>
> > Finally read in detail your response Gabe. Looks great,
> > and I agree it's quite intuitive, as well as agree against
> > non-recycling.
>
> > Once the length(n) == length(dim(x)) behavior is enabled,
> > I don't think there's any need/desire to have head() do
> > x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
> > those familiar with head(x, 6), it would seem to me.
>
> > Mike C
>
> Thank you, Gabe, and Michael.
> I did like Gabe's proposal already back in July but was
> busy and/or vacationing then ...
>
> If you submit this with a patch (that includes changes to both
> *.R and *.Rd , including some example) as "wishlist" item to R's
> bugzilla, I'm willing/happy to check and commit this to R-devel.
>
> Martin
>
>
> > On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker
> > <gabembecker using gmail.com> wrote:
>
> >> Hi Michael and Abby,
> >>
> >> So one thing that could happen that would be backwards
> >> compatible (with the exception of something that was an
> >> error no longer being an error) is head and tail could
> >> take vectors of length length(dim(x)) rather than integers
> >> of length 1 for n, with the default n = 6 being equivalent
> >> to n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for
> >> the deprecation cycle, if not permanently. Not
> >> recycling would be unexpected based on the behavior of
> >> many R functions but would preserve the current behavior
> >> while granting more fine-grained control to users that
> >> feel they need it.
> >>
> >> A rapidly thrown-together prototype of such a method for
> >> the head of a matrix case is as follows:
> >>
> >> head2 = function(x, n = 6L, ...) {
> >>   indvecs = lapply(seq_along(dim(x)), function(i) {
> >>     # use n[i] when supplied; otherwise keep all of dimension i
> >>     ni = if (length(n) >= i) n[i] else dim(x)[i]
> >>     # negative ni drops the last |ni| entries of dimension i
> >>     if (ni < 0L) ni = max(dim(x)[i] + ni, 0L) else ni = min(ni, dim(x)[i])
> >>     seq_len(ni)
> >>   })
> >>   lstargs = c(list(x), indvecs, drop = FALSE)
> >>   do.call("[", lstargs)
> >> }
> >>
> >>
> >> > mat = matrix(1:100, 10, 10)
> >> > head(mat)
> >>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >> [1,]    1   11   21   31   41   51   61   71   81    91
> >> [2,]    2   12   22   32   42   52   62   72   82    92
> >> [3,]    3   13   23   33   43   53   63   73   83    93
> >> [4,]    4   14   24   34   44   54   64   74   84    94
> >> [5,]    5   15   25   35   45   55   65   75   85    95
> >> [6,]    6   16   26   36   46   56   66   76   86    96
> >> > head2(mat)
> >>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >> [1,]    1   11   21   31   41   51   61   71   81    91
> >> [2,]    2   12   22   32   42   52   62   72   82    92
> >> [3,]    3   13   23   33   43   53   63   73   83    93
> >> [4,]    4   14   24   34   44   54   64   74   84    94
> >> [5,]    5   15   25   35   45   55   65   75   85    95
> >> [6,]    6   16   26   36   46   56   66   76   86    96
> >> > head2(mat, c(2, 3))
> >>      [,1] [,2] [,3]
> >> [1,]    1   11   21
> >> [2,]    2   12   22
> >> > head2(mat, c(2, -9))
> >>      [,1]
> >> [1,]    1
> >> [2,]    2
> >>
> >>
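A minimal sketch of the same per-dimension idea on a non-square matrix, where a negative entry of n has to be resolved against the extent of its own dimension. The name head_nd is hypothetical and not part of base R; it is an illustration under those assumptions, not the patch discussed in this thread.

```r
# Hypothetical helper (not in base R): head() along each dimension of an array.
head_nd <- function(x, n = 6L) {
  d <- dim(x)
  stopifnot(!is.null(d), length(n) <= length(d))
  idx <- lapply(seq_along(d), function(i) {
    # dimensions beyond length(n) are kept whole (n is not recycled)
    ni <- if (i <= length(n)) n[i] else d[i]
    # negative ni means "all but the last |ni|" along dimension i
    ni <- if (ni < 0L) max(d[i] + ni, 0L) else min(ni, d[i])
    seq_len(ni)
  })
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

mat <- matrix(1:40, nrow = 4, ncol = 10)
dim(head_nd(mat))            # all 4 rows (fewer than 6 exist), all 10 columns
dim(head_nd(mat, c(2, -9)))  # 2 rows, 10 - 9 = 1 column
```

On this 4 x 10 matrix, resolving the -9 against the number of rows instead of the number of columns would leave zero columns, so the per-dimension lookup matters as soon as the matrix is not square.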
> >> Now one thing to keep in mind here, is that I think we'd
> >> either a) have to make the non-recycling behavior
> >> permanent, or b) have head treat data.frames and matrices
> >> differently with respect to the subsets they grab (which
> >> strikes me as a *Bad Plan* (tm)).
> >>
> >> So I don't think the default behavior would ever be
> >> mat[1:6, 1:6], not because of backwards compatibility,
> >> but because at least in my intuition that is just not
> >> what head on a data.frame should do by default, and I
> >> think the behaviors for the basic rectangular datatypes
> >> should "stick together". I mean, also because of
> >> backwards compatibility, but that could *in theory*
> >> change across a long enough deprecation cycle, but the
> >> conceptually right thing to do with a data.frame probably
> >> won't.
> >>
> >> All of that said, is head(mat, c(6, 6)) really that much
> >> easier to type/better than just mat[1:6, 1:6, drop=FALSE]
> >> (I know this will behave differently if any of the dims
> >> of mat are less than 6, but if so why are you heading it
> >> in the first place ;) )? I don't really have a strong
> >> feeling on the answer to that.
> >>
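To make the difference for small inputs concrete, here is a short sketch using only base R indexing. The matrix small is made up for illustration; the clamped version mimics what head() already does for rows.

```r
small <- matrix(1:12, nrow = 3, ncol = 4)

# Direct indexing fails because rows 4:6 and columns 5:6 do not exist.
res <- tryCatch(small[1:6, 1:6, drop = FALSE],
                error = function(e) conditionMessage(e))
res  # an error message ("subscript out of bounds"), not a matrix

# Clamping each index range to the actual extent never overruns the matrix.
clamped <- small[seq_len(min(6L, nrow(small))),
                 seq_len(min(6L, ncol(small))), drop = FALSE]
dim(clamped)
```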
> >> I'm happy to put a patch for head.matrix,
> >> head.data.frame, tail.matrix and tail.data.frame, plus
> >> documentation, if people on R-core are interested in
> >> this.
> >>
> >> Note, as most here probably know, and as alluded to
> >> above, length(n) > 1 for head or tail currently gives an
> >> error, so this would be an extension of the existing
> >> functionality in the mathematical extension sense, where
> >> all existing behavior would remain identical, but the
> >> support/valid parameter space would grow.
> >>
> >> Best, ~G
> >>
> >>
> >> On Fri, Jul 12, 2019 at 4:03 PM Abby Spurdle
> >> <spurdle.a using gmail.com> wrote:
> >>
> >>> > I assume there are lots of backwards-compatibility issues as well as valid
> >>> > use cases for this behavior, so I guess defaulting to M[1:6, 1:6] is out of
> >>> > the question.
> >>>
> >>> Agree.
> >>>
> >>> > Is there any scope for adding a new argument to head.matrix that would
> >>> > allow this flexibility?
> >>>
> >>> I agree with what you're trying to achieve. However,
> >>> I'm not sure this is as simple as you're suggesting.
> >>>
> >>> What if the user wants "head" in rows but "tail" in
> >>> columns? Or "head" in rows, and both "head" and "tail"
> >>> in columns? With head and tail alone, there's a
> >>> combinatorial explosion.
> >>>
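One way to get such mixed selections without new arguments is to apply head() and tail() to the index vectors and subset once. This is only a sketch of that workaround with made-up sizes, not a proposal from the thread.

```r
m <- matrix(1:100, 10, 10)

rows <- head(seq_len(nrow(m)), 3)  # first 3 row indices
cols <- tail(seq_len(ncol(m)), 2)  # last 2 column indices
corner <- m[rows, cols, drop = FALSE]
corner                             # rows 1:3 of columns 9 and 10

# "head" in rows, and both "head" and "tail" in columns
j <- seq_len(ncol(m))
both <- m[rows, c(head(j, 2), tail(j, 2)), drop = FALSE]
dim(both)
```

Because the result comes from plain indexing, any existing dimnames carry over; an unnamed matrix still prints with renumbered positions, which is the labelling concern raised in the next paragraph.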
> >>> Also, when using tail on an unnamed matrix, it may be
> >>> desirable to name rows and columns.
> >>>
> >>> And all of this assumes standard matrix objects. Add in
> >>> matrix subclasses and related objects, and things get
> >>> more complex still.
> >>>
> >>> As I suggested in another thread, a few days ago, I'm
> >>> planning to write an R package for matrices and
> >>> matrix-like objects (possibly extending the Matrix
> >>> package), with an initial emphasis on subsetting,
> >>> printing and formatting. So, I'm interested to hear
> >>> more suggestions on this topic.
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-devel using r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>
>
>