[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

Mon Sep 16 09:54:07 CEST 2019

Awesome. Gabe, since you already have a workshopped version, would you like
to proceed? Feel free to ping me to review the patch once it's posted.

On Mon, Sep 16, 2019 at 3:26 PM Martin Maechler <maechler using stat.math.ethz.ch>
wrote:

> >>>>> Michael Chirico
> >>>>>     on Sun, 15 Sep 2019 20:52:34 +0800 writes:
>
>     > Finally read in detail your response Gabe. Looks great,
>     > and I agree it's quite intuitive, as well as agree against
>     > non-recycling.
>
>     > Once the length(n) == length(dim(x)) behavior is enabled,
>     > I don't think there's any need/desire to have head() do
>     > x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
>     > those familiar with head(x, 6), it would seem to me.
>
>     > Mike C
>
> Thank you, Gabe, and Michael.
> I did like Gabe's proposal already back in July but was
> busy and/or vacationing then ...
>
> If you submit this with a patch (that includes changes to both
> *.R and *.Rd , including some example) as "wishlist" item to R's
> bugzilla, I'm willing/happy to check and commit this to R-devel.
>
> Martin
>
>
>     > On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker
>     > <gabembecker using gmail.com> wrote:
>
>     >> Hi Michael and Abby,
>     >>
>     >> So one thing that could happen that would be backwards
>     >> compatible (with the exception of something that was an
>     >> error no longer being an error) is head and tail could
>     >> take vectors of length length(dim(x)) rather than integers of
>     >> length 1 for n, with the default being n=6 being equivalent
>     >> to n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for
>     >> the deprecation cycle, if not permanently. It not
>     >> recycling would be unexpected based on the behavior of
>     >> many R functions but would preserve the current behavior
>     >> while granting more fine-grained control to users that
>     >> feel they need it.
>     >>
>     >> A rapidly thrown-together prototype of such a method for
>     >> the head of a matrix case is as follows:
>     >>
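>     >> head2 = function(x, n = 6L, ...) {
>     >>   indvecs = lapply(seq_along(dim(x)), function(i) {
>     >>     # dimensions beyond length(n) are kept whole (no recycling)
>     >>     ni = if (length(n) >= i) n[i] else dim(x)[i]
>     >>     # negative n drops that many elements from the end of
>     >>     # dimension i (dim(x)[i] here, rather than nrow(x))
>     >>     ni = if (ni < 0L) max(dim(x)[i] + ni, 0L) else min(ni, dim(x)[i])
>     >>     seq_len(ni)
>     >>   })
>     >>   lstargs = c(list(x), indvecs, drop = FALSE)
>     >>   do.call("[", lstargs)
>     >> }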
>     >>
>     >>
>     >> > mat = matrix(1:100, 10, 10)
>     >>
>     >> > head(mat)
>     >>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>     >> [1,]    1   11   21   31   41   51   61   71   81    91
>     >> [2,]    2   12   22   32   42   52   62   72   82    92
>     >> [3,]    3   13   23   33   43   53   63   73   83    93
>     >> [4,]    4   14   24   34   44   54   64   74   84    94
>     >> [5,]    5   15   25   35   45   55   65   75   85    95
>     >> [6,]    6   16   26   36   46   56   66   76   86    96
>     >>
>     >> > head2(mat)
>     >>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>     >> [1,]    1   11   21   31   41   51   61   71   81    91
>     >> [2,]    2   12   22   32   42   52   62   72   82    92
>     >> [3,]    3   13   23   33   43   53   63   73   83    93
>     >> [4,]    4   14   24   34   44   54   64   74   84    94
>     >> [5,]    5   15   25   35   45   55   65   75   85    95
>     >> [6,]    6   16   26   36   46   56   66   76   86    96
>     >>
>     >> > head2(mat, c(2, 3))
>     >>      [,1] [,2] [,3]
>     >> [1,]    1   11   21
>     >> [2,]    2   12   22
>     >>
>     >> > head2(mat, c(2, -9))
>     >>      [,1]
>     >> [1,]    1
>     >> [2,]    2
>     >>
>     >>
>     >> Now one thing to keep in mind here, is that I think we'd
>     >> either a) have to make the non-recycling behavior
>     >> permanent, or b) have head treat data.frames and matrices
>     >> different with respect to the subsets they grab (which
>     >> strikes me as a *Bad Plan* (tm)).
>     >>
>     >> So I don't think the default behavior would ever be
>     >> mat[1:6, 1:6], not because of backwards compatibility,
>     >> but because at least in my intuition that is just not
>     >> what head on a data.frame should do by default, and I
>     >> think the behaviors for the basic rectangular datatypes
>     >> should "stick together". I mean, also because of
>     >> backwards compatibility, but that could *in theory*
>     >> change across a long enough deprecation cycle, but the
>     >> conceptually right thing to do with a data.frame probably
>     >> won't.
>     >>
>     >> All of that said, is head(mat, c(6, 6)) really that much
>     >> easier to type/better than just mat[1:6, 1:6, drop=FALSE]
>     >> (I know this will behave differently if any of the dims
>     >> of mat are less than 6, but if so why are you heading it
>     >> in the first place ;) )? I don't really have a strong
>     >> feeling on the answer to that.
>     >>
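
To illustrate the parenthetical above, here is a minimal sketch of how plain indexing and a clamped, head-style selection differ once a dimension is shorter than 6. The names `small`, `res` and `clamped` are made up for this example and are not part of any proposed patch.

```r
# A matrix with fewer than 6 rows and fewer than 6 columns (illustrative data).
small <- matrix(1:12, nrow = 3, ncol = 4)

# Plain indexing does not clamp: asking for rows/columns that do not exist
# is an error, which is captured here instead of stopping the script.
res <- tryCatch(small[1:6, 1:6, drop = FALSE], error = function(e) e)
inherits(res, "error")

# A head-style selection clamps each requested count to the available extent,
# so it returns whatever is there.
clamped <- small[seq_len(min(6L, nrow(small))),
                 seq_len(min(6L, ncol(small))),
                 drop = FALSE]
dim(clamped)
```

So the two spellings only agree when every dimension has at least 6 elements, which is the caveat raised in the paragraph above.
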
>     >> I'm happy to put a patch for head.matrix,
>     >> head.data.frame, tail.matrix and tail.data.frame, plus
>     >> documentation, if people on R-core are interested in
>     >> this.
>     >>
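
For reference, a tail counterpart to the head2 prototype could look roughly like the sketch below. This is only an assumption about how the same per-dimension, non-recycling rule might carry over; `tail2` is a hypothetical name, it is not the code of any posted patch, and unlike the existing tail() for matrices it makes no attempt to label rows with their original indices.

```r
# Hypothetical tail analogue of head2: n[i] applies to dimension i, and
# dimensions beyond length(n) are kept whole (no recycling).
tail2 <- function(x, n = 6L, ...) {
  d <- dim(x)
  indvecs <- lapply(seq_along(d), function(i) {
    ni <- if (length(n) >= i) n[i] else d[i]
    # negative n: drop that many elements from the start of dimension i
    ni <- if (ni < 0L) max(d[i] + ni, 0L) else min(ni, d[i])
    # the last ni indices of dimension i (empty when ni is 0)
    d[i] - ni + seq_len(ni)
  })
  do.call("[", c(list(x), indvecs, drop = FALSE))
}

mat <- matrix(1:100, 10, 10)
tail2(mat)             # last 6 rows, all columns
tail2(mat, c(2, 3))    # last 2 rows, last 3 columns
tail2(mat, c(2, -9))   # last 2 rows, all but the first 9 columns
```
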
>     >> Note, as most here probably know, and as alluded to
>     >> above, length(n) > 1 for head or tail currently gives an
>     >> error, so this would be an extension of the existing
>     >> functionality in the mathematical extension sense, where
>     >> all existing behavior would remain identical, but the
>     >> support/valid parameter space would grow.
>     >>
>     >> Best, ~G
>     >>
>     >>
>     >> On Fri, Jul 12, 2019 at 4:03 PM Abby Spurdle
>     >> <spurdle.a using gmail.com> wrote:
>     >>
>     >>> > I assume there are lots of backwards-compatibility issues as
>     >>> > well as valid use cases for this behavior, so I guess
>     >>> > defaulting to M[1:6, 1:6] is out of the question.
>     >>>
>     >>> Agree.
>     >>>
>     >>> > Is there any scope for adding a new argument to head.matrix
>     >>> > that would allow this flexibility?
>     >>>
>     >>> I agree with what you're trying to achieve.  However,
>     >>> I'm not sure this is as simple as you're suggesting.
>     >>>
>     >>> What if the user wants "head" in rows but "tail" in
>     >>> columns?  Or "head" in rows, and both "head" and "tail"
>     >>> in columns?  With head and tail alone, there's a
>     >>> combinatorial explosion.
>     >>>
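
One way to sidestep the combinations described above is to build the index vector for each dimension separately and combine them in a single subsetting call. The sketch below assumes two hypothetical helpers, `head_idx` and `tail_idx`, which are invented for this example and exist in no package.

```r
# Hypothetical helpers: indices of the first / last n elements of a
# dimension of length len, clamped to what is available.
head_idx <- function(len, n) seq_len(min(n, len))
tail_idx <- function(len, n) {
  n <- min(n, len)
  len - n + seq_len(n)
}

mat <- matrix(1:100, 10, 10)

# "head" in rows, "tail" in columns
top_right <- mat[head_idx(nrow(mat), 2), tail_idx(ncol(mat), 3), drop = FALSE]

# "head" in rows, both "head" and "tail" in columns
both <- mat[head_idx(nrow(mat), 2),
            c(head_idx(ncol(mat), 2), tail_idx(ncol(mat), 2)),
            drop = FALSE]
```

Note that the combined column index can repeat columns when the matrix is narrower than the two requested counts together, so a real implementation would have to decide how to handle that overlap.
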
>     >>> Also, when using tail on an unnamed matrix, it may be
>     >>> desirable to name rows and columns.
>     >>>
>     >>> And all of this assumes standard matrix objects.  Add in
>     >>> matrix subclasses and related objects, and things get
>     >>> more complex still.
>     >>>
>     >>> As I suggested in another thread, a few days ago, I'm
>     >>> planning to write an R package for matrices and
>     >>> matrix-like objects (possibly extending the Matrix
>     >>> package), with an initial emphasis on subsetting,
>     >>> printing and formatting.  So, I'm interested to hear
>     >>> more suggestions on this topic.
>     >>>
>     >>> [[alternative HTML version deleted]]
>     >>>
>     >>> ______________________________________________
>     >>> R-devel using r-project.org mailing list
>     >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>     >>>
>     >>
>
>
