[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Mon Sep 16 09:23:29 CEST 2019
>>>>> Michael Chirico
>>>>> on Sun, 15 Sep 2019 20:52:34 +0800 writes:
> Finally read your response in detail, Gabe. Looks great,
> and I agree it's quite intuitive; I also agree with not
> recycling n.
> Once the length(n) == length(dim(x)) behavior is enabled,
> I don't think there's any need/desire to have head() do
> x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
> those familiar with head(x, 6), it would seem to me.
> Mike C
Thank you, Gabe, and Michael.
I did like Gabe's proposal already back in July but was
busy and/or vacationing then ...
If you submit this with a patch (that includes changes to both
*.R and *.Rd , including some example) as "wishlist" item to R's
bugzilla, I'm willing/happy to check and commit this to R-devel.
Martin
> On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker
> <gabembecker using gmail.com> wrote:
>> Hi Michael and Abby,
>>
>> So one thing that could happen that would be backwards
>> compatible (with the exception of something that was an
>> error no longer being an error) is that head and tail could
>> take vectors of length length(dim(x)) rather than integers
>> of length 1 for n, with the default n = 6 being equivalent
>> to n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for
>> the deprecation cycle, if not permanently. Not recycling n
>> would be unexpected given the behavior of many R functions,
>> but it would preserve the current behavior while granting
>> more fine-grained control to users who feel they need it.
>>
>> A rapidly thrown-together prototype of such a method for
>> the head of a matrix case is as follows:
>>
>> head2 = function(x, n = 6L, ...) {
>>   indvecs = lapply(seq_along(dim(x)), function(i) {
>>     # use n[i] if supplied, else keep the full extent of dimension i
>>     ni = if (length(n) >= i) n[i] else dim(x)[i]
>>     # negative ni drops that many from the end of dimension i
>>     if (ni < 0L) ni = max(dim(x)[i] + ni, 0L) else ni = min(ni, dim(x)[i])
>>     seq_len(ni)
>>   })
>>   lstargs = c(list(x), indvecs, drop = FALSE)
>>   do.call("[", lstargs)
>> }
>>
>>
>> > mat = matrix(1:100, 10, 10)
>>
>> > head(mat)
>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> [1,]    1   11   21   31   41   51   61   71   81    91
>> [2,]    2   12   22   32   42   52   62   72   82    92
>> [3,]    3   13   23   33   43   53   63   73   83    93
>> [4,]    4   14   24   34   44   54   64   74   84    94
>> [5,]    5   15   25   35   45   55   65   75   85    95
>> [6,]    6   16   26   36   46   56   66   76   86    96
>>
>> > head2(mat)
>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> [1,]    1   11   21   31   41   51   61   71   81    91
>> [2,]    2   12   22   32   42   52   62   72   82    92
>> [3,]    3   13   23   33   43   53   63   73   83    93
>> [4,]    4   14   24   34   44   54   64   74   84    94
>> [5,]    5   15   25   35   45   55   65   75   85    95
>> [6,]    6   16   26   36   46   56   66   76   86    96
>>
>> > head2(mat, c(2, 3))
>>      [,1] [,2] [,3]
>> [1,]    1   11   21
>> [2,]    2   12   22
>>
>> > head2(mat, c(2, -9))
>>      [,1]
>> [1,]    1
>> [2,]    2
>>
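As a quick check of the prototype on a non-square matrix, here is a minimal sketch. It assumes the negative-n branch is meant to count back from dim(x)[i] for each dimension rather than from nrow(x) (on the square 10 x 10 example the two coincide). The 4 x 10 matrix `wide` is made up for illustration.

```r
# Variant of the head2 prototype; negative n[i] is measured against dim(x)[i]
# (an assumption about the intended behaviour, not confirmed in the thread).
head2 <- function(x, n = 6L, ...) {
  indvecs <- lapply(seq_along(dim(x)), function(i) {
    ni <- if (length(n) >= i) n[i] else dim(x)[i]
    ni <- if (ni < 0L) max(dim(x)[i] + ni, 0L) else min(ni, dim(x)[i])
    seq_len(ni)
  })
  do.call("[", c(list(x), indvecs, drop = FALSE))
}

wide <- matrix(1:40, 4, 10)           # hypothetical non-square input
res_default <- head2(wide)            # n = 6 exceeds nrow, so all 4 rows are kept
res_corner  <- head2(wide, c(2, -9))  # first 2 rows, all but the last 9 columns
dim(res_default)
dim(res_corner)
```

With nrow(x) in the negative branch, the second call would select zero columns on this input, since 4 - 9 is clamped to 0.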
>> Now one thing to keep in mind here is that I think we'd
>> either a) have to make the non-recycling behavior
>> permanent, or b) have head treat data.frames and matrices
>> differently with respect to the subsets they grab (which
>> strikes me as a *Bad Plan* (tm)).
>>
>> So I don't think the default behavior would ever be
>> mat[1:6, 1:6], not because of backwards compatibility,
>> but because at least in my intuition that is just not
>> what head on a data.frame should do by default, and I
>> think the behaviors for the basic rectangular datatypes
>> should "stick together". I mean, also because of
>> backwards compatibility, but that could *in theory*
>> change across a long enough deprecation cycle, but the
>> conceptually right thing to do with a data.frame probably
>> won't.
>>
>> All of that said, is head(mat, c(6, 6)) really that much
>> easier to type/better than just mat[1:6, 1:6, drop=FALSE]
>> (I know this will behave differently if any of the dims
>> of mat are less than 6, but if so why are you heading it
>> in the first place ;) )? I don't really have a strong
>> feeling on the answer to that.
>>
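To make the parenthetical concrete, here is a minimal sketch of how the two spellings differ when a dimension is shorter than 6. The 3 x 4 matrix `small` is a made-up example, and the clamped indexing is written out by hand rather than relying on any particular head() behaviour.

```r
small <- matrix(1:12, 3, 4)  # hypothetical matrix with both dims below 6

# Plain indexing past the end of a matrix dimension is an error.
plain <- tryCatch(small[1:6, 1:6, drop = FALSE],
                  error = function(e) "error")

# Clamping each index range to the available extent, as head-like code would.
clamped <- small[seq_len(min(6L, nrow(small))),
                 seq_len(min(6L, ncol(small))),
                 drop = FALSE]

plain
dim(clamped)
```

So the bracket form is only a drop-in replacement when every dimension is known to be at least as long as the requested count.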
>> I'm happy to put a patch for head.matrix,
>> head.data.frame, tail.matrix and tail.data.frame, plus
>> documentation, if people on R-core are interested in
>> this.
>>
>> Note, as most here probably know, and as alluded to
>> above, length(n) > 1 for head or tail currently gives an
>> error, so this would be an extension of the existing
>> functionality in the mathematical extension sense, where
>> all existing behavior would remain identical, but the
>> support/valid parameter space would grow.
>>
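Whether a vector n is accepted depends on the R version in use, so here is a minimal sketch that simply reports what the running R does with head(mat, c(2L, 3L)). Nothing beyond base R is assumed; the variable names are illustrative.

```r
mat <- matrix(1:100, 10, 10)

# NULL means this R rejects length(n) > 1, as described in the thread;
# otherwise the call succeeded and res holds the returned sub-matrix.
res <- tryCatch(head(mat, c(2L, 3L)), error = function(e) NULL)

if (is.null(res)) {
  message("head() rejects length(n) > 1 in this R")
} else {
  message("head() accepted a vector n; result is ",
          paste(dim(res), collapse = " x "))
}
```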
>> Best, ~G
>>
>>
>> On Fri, Jul 12, 2019 at 4:03 PM Abby Spurdle
>> <spurdle.a using gmail.com> wrote:
>>
>>> > I assume there are lots of backwards-compatibility
>>> > issues as well as valid use cases for this behavior,
>>> > so I guess defaulting to M[1:6, 1:6] is out of the
>>> > question.
>>>
>>> Agree.
>>>
>>> > Is there any scope for adding a new argument to
>>> > head.matrix that would allow this flexibility?
>>>
>>> I agree with what you're trying to achieve. However,
>>> I'm not sure this is as simple as you're suggesting.
>>>
>>> What if the user wants "head" in rows but "tail" in
>>> columns? Or "head" in rows, and both "head" and "tail"
>>> in columns? With head and tail alone, there's a
>>> combinatorial explosion.
>>>
>>> Also, when using tail on an unnamed matrix, it may be
>>> desirable to name rows and columns.
>>>
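One way to sidestep the combinatorial explosion is to apply head() and tail() to the index vectors rather than to the matrix itself. A minimal sketch, using only base R: the names `rows`, `cols` and `corner` are illustrative, and labelling the result with the original positions is one possible answer to the naming point, not an established convention.

```r
mat <- matrix(1:100, 10, 10)

rows <- head(seq_len(nrow(mat)), 2L)  # "head" in rows: 1, 2
cols <- tail(seq_len(ncol(mat)), 3L)  # "tail" in columns: 8, 9, 10

corner <- mat[rows, cols, drop = FALSE]

# Record the original positions so the tail columns stay identifiable.
dimnames(corner) <- list(paste0("[", rows, ",]"), paste0("[,", cols, "]"))
corner
```

Any mix of head and tail per dimension can be built this way, at the cost of being more verbose than a single call.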
>>> And all of this assumes standard matrix objects. Add in
>>> matrix subclasses and related objects, and things get
>>> more complex still.
>>>
>>> As I suggested in another thread, a few days ago, I'm
>>> planning to write an R package for matrices and
>>> matrix-like objects (possibly extending the Matrix
>>> package), with an initial emphasis on subsetting,
>>> printing and formatting. So, I'm interested to hear
>>> more suggestions on this topic.
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel