[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
Pages, Herve
hpages using fredhutch.org
Tue Sep 17 08:52:16 CEST 2019
Hi,
Alternatively, how about a new glance() generic that would do something
like this:
> library(DelayedArray)
> glance <- DelayedArray:::show_compact_array
> M <- matrix(rnorm(1e6), nrow = 1000L, ncol = 2000L)
> glance(M)
<1000 x 2000> matrix object of type "double":
               [,1]        [,2]        [,3] ...    [,1999]    [,2000]
   [1,]  -0.8854896   1.8010288   1.3051341   . -0.4473593  0.4684985
   [2,]  -0.8563415  -0.7102768  -0.9309155   . -1.8743504  0.4300557
   [3,]   1.0558159  -0.5956583   1.2689806   .  2.7292249  0.2608300
   [4,]   0.7547356   0.1465714   0.1798959   . -0.1778017  1.3417423
   [5,]   0.8037360  -2.7081809   0.9766657   . -0.9902788  0.1741957
    ...           .           .           .   .          .          .
 [996,]  0.67220752  0.07804320 -0.38743454   .  0.4438639 -0.8130713
 [997,] -0.67349962 -1.15292067 -0.54505567   .  0.4630923 -1.6287694
 [998,]  0.03374595 -1.68061325 -0.88458368   . -0.2890962  0.2552267
 [999,]  0.47861492  1.25530912  0.19436708   . -0.5193121 -1.1695501
[1000,]  1.52819218  2.23253275 -1.22051720   . -1.0342430 -0.1703396
> A <- array(rnorm(1e6), c(50, 20, 10, 100))
> glance(A)
<50 x 20 x 10 x 100> array object of type "double":
,,1,1
            [,1]       [,2]       [,3] ...      [,19]      [,20]
 [1,] 0.78319619 0.82258390 0.09122269   .  1.7288189  0.7968574
 [2,] 2.80687459 0.63709640 0.80844430   . -0.3963161 -1.2768284
  ...          .          .          .   .          .          .
[49,] -1.0696320 -0.1698111  2.0082890   .  0.4488292  0.5215745
[50,] -0.7012526 -2.0818229  0.7750518   .  0.3189076  0.1437394
...
,,10,100
           [,1]       [,2]       [,3] ...      [,19]      [,20]
 [1,] 0.5360649  0.5491561 -0.4098350   .  0.7647435  0.5640699
 [2,] 0.7924093 -0.7395815 -1.3792913   .  0.1980287 -0.2897026
  ...         .          .          .   .          .          .
[49,] 0.6266209  0.3778512  1.4995778   . -0.3820651 -1.4241691
[50,] 1.9218715  3.5475949  0.5963763   .  0.4005210  0.4385623
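For anyone without DelayedArray installed, here is a minimal base-R sketch of
the same idea for an ordinary matrix. The name glance_matrix and its
nhead/ntail arguments are made up for illustration; this is not how
show_compact_array is implemented, and it does not draw the "..." elision
markers.

```r
# Hypothetical helper (not in base R or DelayedArray): print only the
# corners of a matrix, labelled with their original row/column positions.
glance_matrix <- function(x, nhead = 3L, ntail = 2L) {
    stopifnot(is.matrix(x))
    # Positions to keep along a dimension of extent d:
    # the first nhead and the last ntail, or everything if d is small.
    keep <- function(d) {
        if (d <= nhead + ntail)
            seq_len(d)
        else
            c(seq_len(nhead), d - rev(seq_len(ntail)) + 1L)
    }
    i <- keep(nrow(x))
    j <- keep(ncol(x))
    out <- x[i, j, drop = FALSE]
    dimnames(out) <- list(sprintf("[%d,]", i), sprintf("[,%d]", j))
    cat(sprintf("<%d x %d> matrix of type \"%s\"\n",
                nrow(x), ncol(x), typeof(x)))
    print(out)
    invisible(out)
}

M <- matrix(rnorm(2e6), nrow = 1000L, ncol = 2000L)
corner <- glance_matrix(M)
```

The returned corner is a 5 x 5 matrix here, so the cost of printing no longer
depends on the number of columns.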
H.
On 9/16/19 00:54, Michael Chirico wrote:
> Awesome. Gabe, since you already have a workshopped version, would you like
> to proceed? Feel free to ping me to review the patch once it's posted.
>
> On Mon, Sep 16, 2019 at 3:26 PM Martin Maechler <maechler using stat.math.ethz.ch>
> wrote:
>
>>>>>>> Michael Chirico
>>>>>>> on Sun, 15 Sep 2019 20:52:34 +0800 writes:
>>
>> > Finally read in detail your response Gabe. Looks great,
>> > and I agree it's quite intuitive, as well as agree against
>> > non-recycling.
>>
>> > Once the length(n) == length(dim(x)) behavior is enabled,
>> > I don't think there's any need/desire to have head() do
>> > x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
>> > those familiar with head(x, 6), it would seem to me.
>>
>> > Mike C
>>
>> Thank you, Gabe, and Michael.
>> I did like Gabe's proposal already back in July but was
>> busy and/or vacationing then ...
>>
>> If you submit this with a patch (that includes changes to both
>> *.R and *.Rd , including some example) as "wishlist" item to R's
>> bugzilla, I'm willing/happy to check and commit this to R-devel.
>>
>> Martin
>>
>>
>> > On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker
>> > <gabembecker using gmail.com> wrote:
>>
>> >> Hi Michael and Abby,
>> >>
>> >> So one thing that could happen that would be backwards
>> >> compatible (with the exception of something that was an
>> >> error no longer being an error) is head and tail could
>> >> take vectors of length length(dim(x)), rather than
>> >> integers of length 1, for n, with the default n = 6 being
>> >> equivalent to n = c(6, dim(x)[2], <...>, dim(x)[k]), at
>> >> least for the deprecation cycle, if not permanently. Not
>> >> recycling n would be unexpected given the behavior of
>> >> many R functions, but it would preserve the current
>> >> behavior while granting more fine-grained control to
>> >> users who feel they need it.
>> >>
>> >> A rapidly thrown-together prototype of such a method for
>> >> the head of a matrix case is as follows:
>> >>
>> >> head2 = function(x, n = 6L, ...) {
>> >>     indvecs = lapply(seq_along(dim(x)), function(i) {
>> >>         # dimensions beyond length(n) are kept in full (no recycling)
>> >>         if(length(n) >= i) { ni = n[i] } else { ni = dim(x)[i] }
>> >>         if(ni < 0L)
>> >>             ni = max(dim(x)[i] + ni, 0L)  # drop the last |ni| of dim i
>> >>         else
>> >>             ni = min(ni, dim(x)[i])
>> >>         seq_len(ni)
>> >>     })
>> >>     lstargs = c(list(x), indvecs, drop = FALSE)
>> >>     do.call("[", lstargs)
>> >> }
>> >>
>> >>
>> >> > mat = matrix(1:100, 10, 10)
>> >> > head(mat)
>> >>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> >> [1,]    1   11   21   31   41   51   61   71   81    91
>> >> [2,]    2   12   22   32   42   52   62   72   82    92
>> >> [3,]    3   13   23   33   43   53   63   73   83    93
>> >> [4,]    4   14   24   34   44   54   64   74   84    94
>> >> [5,]    5   15   25   35   45   55   65   75   85    95
>> >> [6,]    6   16   26   36   46   56   66   76   86    96
>> >> > head2(mat)
>> >>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> >> [1,]    1   11   21   31   41   51   61   71   81    91
>> >> [2,]    2   12   22   32   42   52   62   72   82    92
>> >> [3,]    3   13   23   33   43   53   63   73   83    93
>> >> [4,]    4   14   24   34   44   54   64   74   84    94
>> >> [5,]    5   15   25   35   45   55   65   75   85    95
>> >> [6,]    6   16   26   36   46   56   66   76   86    96
>> >> > head2(mat, c(2, 3))
>> >>      [,1] [,2] [,3]
>> >> [1,]    1   11   21
>> >> [2,]    2   12   22
>> >> > head2(mat, c(2, -9))
>> >>      [,1]
>> >> [1,]    1
>> >> [2,]    2
>> >>
>> >>
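A tail counterpart can be sketched along the same lines. tail2 is a
hypothetical name, not the function that would go into a patch, and unlike
tail.matrix it makes no attempt to label rows with their original positions.

```r
# Hypothetical tail analogue of the head2 prototype: n[i] is applied to
# dimension i, and dimensions beyond length(n) are kept in full.
tail2 <- function(x, n = 6L, ...) {
    d <- dim(x)
    indvecs <- lapply(seq_along(d), function(i) {
        ni <- if (length(n) >= i) n[i] else d[i]
        # Negative ni means "all but the first |ni|"; clamp to [0, d[i]].
        ni <- if (ni < 0L) max(d[i] + ni, 0L) else min(ni, d[i])
        # The last ni positions of dimension i (empty when ni is 0).
        as.integer(seq.int(to = d[i], length.out = ni))
    })
    do.call("[", c(list(x), indvecs, drop = FALSE))
}

mat <- matrix(1:100, 10, 10)
tail2(mat, c(2, 3))    # rows 9:10, columns 8:10
tail2(mat, c(2, -9))   # rows 9:10, column 10 only
```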
>> >> Now one thing to keep in mind here is that I think we'd
>> >> either a) have to make the non-recycling behavior
>> >> permanent, or b) have head treat data.frames and matrices
>> >> differently with respect to the subsets they grab (which
>> >> strikes me as a Bad Plan (tm)).
>> >>
>> >> So I don't think the default behavior would ever be
>> >> mat[1:6, 1:6], not because of backwards compatibility,
>> >> but because at least in my intuition that is just not
>> >> what head on a data.frame should do by default, and I
>> >> think the behaviors for the basic rectangular datatypes
>> >> should "stick together". I mean, also because of
>> >> backwards compatibility, but that could *in theory*
>> >> change across a long enough deprecation cycle, but the
>> >> conceptually right thing to do with a data.frame probably
>> >> won't.
>> >>
>> >> All of that said, is head(mat, c(6, 6)) really that much
>> >> easier to type/better than just mat[1:6, 1:6, drop=FALSE]
>> >> (I know this will behave differently if any of the dims
>> >> of mat are less than 6, but if so why are you heading it
>> >> in the first place ;) )? I don't really have a strong
>> >> feeling on the answer to that.
>> >>
>> >> I'm happy to put together a patch for head.matrix,
>> >> head.data.frame, tail.matrix and tail.data.frame, plus
>> >> documentation, if people on R-core are interested in
>> >> this.
>> >>
>> >> Note, as most here probably know, and as alluded to
>> >> above, length(n) > 1 for head or tail currently gives an
>> >> error, so this would be an extension of the existing
>> >> functionality in the mathematical extension sense, where
>> >> all existing behavior would remain identical, but the
>> >> support/valid parameter space would grow.
>> >>
>> >> Best, ~G
>> >>
>> >>
>> >> On Fri, Jul 12, 2019 at 4:03 PM Abby Spurdle
>> >> <spurdle.a using gmail.com> wrote:
>> >>
>> >>> > I assume there are lots of backwards-compatibility
>> >>> > issues as well as valid use cases for this behavior,
>> >>> > so I guess defaulting to M[1:6, 1:6] is out of the
>> >>> > question.
>> >>>
>> >>> Agree.
>> >>>
>> >>> > Is there any scope for adding a new argument to
>> >>> > head.matrix that would allow this flexibility?
>> >>>
>> >>> I agree with what you're trying to achieve. However,
>> >>> I'm not sure this is as simple as you're suggesting.
>> >>>
>> >>> What if the user wants "head" in rows but "tail" in
>> >>> columns? Or "head" in rows, and both "head" and "tail"
>> >>> in columns? With head and tail alone, there's a
>> >>> combinatorial explosion.
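To make the mixed case concrete, here is a rough sketch of one way a single
function could take a per-dimension choice of end. pick_ends and its end
argument are invented for illustration and are not part of any proposal in
this thread; the sketch covers "head in rows, tail in columns" but not "both
head and tail" along the same dimension.

```r
# Hypothetical helper: take n[i] elements from the start ("head") or the
# end ("tail") of dimension i. n and end need one entry per dimension.
pick_ends <- function(x, n, end) {
    d <- dim(x)
    stopifnot(length(n) == length(d), length(end) == length(d),
              all(end %in% c("head", "tail")), all(n >= 0))
    idx <- lapply(seq_along(d), function(i) {
        ni <- min(n[i], d[i])
        if (end[i] == "head")
            seq_len(ni)
        else
            as.integer(seq.int(to = d[i], length.out = ni))
    })
    do.call("[", c(list(x), idx, drop = FALSE))
}

mat <- matrix(1:100, 10, 10)
pick_ends(mat, c(2, 3), c("head", "tail"))   # rows 1:2, columns 8:10
```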
>> >>>
>> >>> Also, when using tail on an unnamed matrix, it may be
>> >>> desirable to name rows and columns.
>> >>>
>> >>> And all of this assumes standard matrix objects. Add in
>> >>> matrix subclasses and related objects, and things get
>> >>> more complex still.
>> >>>
>> >>> As I suggested in another thread, a few days ago, I'm
>> >>> planning to write an R package for matrices and
>> >>> matrix-like objects (possibly extending the Matrix
>> >>> package), with an initial emphasis on subsetting,
>> >>> printing and formatting. So, I'm interested to hear
>> >>> more suggestions on this topic.
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-devel using r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >>>
>> >>
>>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages using fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319