[Bioc-devel] Subsetting Lists by Lists
Hervé Pagès
hpages at fhcrc.org
Tue Apr 1 20:00:28 CEST 2014
Hi Michael,
On 04/01/2014 07:21 AM, Michael Lawrence wrote:
> Mostly to Herve:
>
> Sometimes we want to pluck the first 1, or 10, or whatever elements from
> each element of a list. If I had a list 'x', I thought I could do this with:
>
> x[IntegerList(1:5)]
>
> But it only gives elements 1:5 from x[[1]], not each element of 'x'. In
> other words, I thought the index would be repped out. Instead, 'x' is
> subset to the length of 'i', and I'm not sure if that makes sense?
Before I reworked the subsetting code last year, subsetting a List by
a shorter List was not supported. For example, in BioC 2.11:
  > cvg
  SimpleRleList of length 3
  $chr1
  integer-Rle of length 11 with 4 runs
    Lengths: 4 1 5 1
    Values : 1 2 3 0

  $chr2
  integer-Rle of length 12 with 5 runs
    Lengths: 1 1 1 7 2
    Values : 0 1 2 3 0

  $chr3
  integer-Rle of length 13 with 6 runs
    Lengths: 6 1 1 1 1 3
    Values : 0 1 2 3 4 0

  > cvg[IntegerList(1:5)]
  Error in seqselect(x, i) :
    'length(start)' must equal 'length(x)' when 'end' and 'width' are NULL
When I reworked the subsetting code, I allowed this. I chose the
current behavior because it returns an object that has the length
of the subscript, which is consistent with what subsetting does in
general and, most importantly, with what subsetting by a named List
does in particular:
  > cvg[IntegerList(chr2=1:5)]
  RleList of length 1
  $chr2
  integer-Rle of length 5 with 4 runs
    Lengths: 1 1 1 2
    Values : 0 1 2 3
Put another way, I thought it would be strange for the subscript to
be recycled when it is unnamed but not when it is named. But maybe
it's a bigger surprise that 'x[IntegerList(1:5)]' extracts only the
first 5 elements of 'x[[1]]', and I can see that it would be
convenient to have a simple way to extract them from *each* list
element in 'x'.
Too late to change this for this release unfortunately.
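To make the two behaviors concrete, here is a rough sketch in base R
that uses ordinary lists of integer vectors instead of RleList objects.
It only mimics the semantics discussed above and is not the IRanges
implementation. The function names (subset_by_list, pluck, pluck_head,
pluck_tail) are hypothetical, chosen for illustration only, and 'cvg'
is the coverage from the example expanded into plain vectors.

```r
# Base-R emulation with ordinary lists; NOT the IRanges implementation.
# All function names below are hypothetical.

# Current semantics: the result has the length of the subscript 'i'.
# Elements of 'i' are matched to elements of 'x' by name when 'i' is
# named, and by position otherwise. No recycling takes place.
subset_by_list <- function(x, i) {
  keys <- if (is.null(names(i))) seq_along(i) else names(i)
  out <- lapply(seq_along(i), function(k) x[[keys[[k]]]][i[[k]]])
  names(out) <- if (is.null(names(i))) names(x)[seq_along(i)] else names(i)
  out
}

# Recycled semantics: apply the same index 'i' to every element of 'x'.
# Out-of-range positions yield NA, as usual for vector subsetting.
pluck <- function(x, i) lapply(x, function(el) el[i])

# Head/tail variants that tolerate elements with fewer than 'n' entries.
pluck_head <- function(x, n) lapply(x, head, n = n)
pluck_tail <- function(x, n) lapply(x, tail, n = n)

# The coverage from the example, expanded from its run-length encoding.
cvg <- list(
  chr1 = rep(c(1L, 2L, 3L, 0L),         c(4, 1, 5, 1)),
  chr2 = rep(c(0L, 1L, 2L, 3L, 0L),     c(1, 1, 1, 7, 2)),
  chr3 = rep(c(0L, 1L, 2L, 3L, 4L, 0L), c(6, 1, 1, 1, 1, 3))
)

first5_chr1 <- subset_by_list(cvg, list(1:5))         # length 1: chr1 only
first5_chr2 <- subset_by_list(cvg, list(chr2 = 1:5))  # length 1: chr2 only
first5_all  <- pluck(cvg, 1:5)                        # length 3: every element
```

With these definitions, pluck_head(cvg, 20) returns each element
unchanged because none has 20 entries, whereas pluck(cvg, 1:20) would
pad each element with NAs.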
H.
>
> But maybe what we really want are pluckHead/Tail, which would be robust to
> the case that < n elements are in an element. And of course a more general
> pluck(x, i) to select 'i' from each element, but I wanted the line above to
> do that.
>
> Michael
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319