[Bioc-devel] Subsetting Lists by Lists

Hervé Pagès hpages at fhcrc.org
Tue Apr 1 20:00:28 CEST 2014


Hi Michael,

On 04/01/2014 07:21 AM, Michael Lawrence wrote:
> Mostly to Herve:
>
> Sometimes we want to pluck the first 1, or 10, or whatever elements from
> each element of a list. If I had a list 'x', I thought I could do this with:
>
> x[IntegerList(1:5)]
>
> But it only gives elements 1:5 from x[[1]], not each element of 'x'. In
> other words, I thought the index would be repped out. Instead, 'x' is
> subset to the length of 'i', and I'm not sure if that makes sense?

Before I reworked the subsetting code last year, subsetting a List by
a shorter List was not supported. For example, in BioC 2.11:

   > cvg
   SimpleRleList of length 3
   $chr1
   integer-Rle of length 11 with 4 runs
     Lengths: 4 1 5 1
     Values : 1 2 3 0

   $chr2
   integer-Rle of length 12 with 5 runs
     Lengths: 1 1 1 7 2
     Values : 0 1 2 3 0

   $chr3
   integer-Rle of length 13 with 6 runs
     Lengths: 6 1 1 1 1 3
     Values : 0 1 2 3 4 0

   > cvg[IntegerList(1:5)]
   Error in seqselect(x, i) :
     'length(start)' must equal 'length(x)' when 'end' and 'width' are NULL

When I reworked the subsetting code, I allowed this. I chose the
current behavior because it returns an object that has the length
of the subscript, which is consistent with what subsetting does in
general and, most importantly, with what subsetting by a named List
does in particular:

   > cvg[IntegerList(chr2=1:5)]
   RleList of length 1
   $chr2
   integer-Rle of length 5 with 4 runs
     Lengths: 1 1 1 2
     Values : 0 1 2 3

Put another way, I thought it would be strange for the subscript
to be recycled when it's not named but not recycled when it's
named. But maybe it's a bigger surprise that 'x[IntegerList(1:5)]'
only extracts the first 5 elements of 'x[[1]]', and I can see
that it would be convenient to have a simple way to extract them
from *each* list element in 'x'.
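
To make the two semantics concrete, here is a minimal sketch on plain
base R lists. It is not the IRanges implementation, and the names
'subset_by_list' and 'pluck_head' are hypothetical, made up for this
illustration only. The 'recycle = FALSE' branch mimics the current
behavior (result has the length of the subscript), 'recycle = TRUE'
mimics the repped-out index, and 'pluck_head' is one possible way to
stay robust when a list element has fewer than 'n' elements:

```r
# Toy data: a plain named list standing in for a List object (assumption).
x <- list(chr1 = c(1, 1, 1, 1, 2, 3),
          chr2 = c(0, 1, 2, 3, 3, 3, 3),
          chr3 = c(0, 0, 4))

# Hypothetical helper: subset each element of 'x' by the matching element
# of the list subscript 'i'.
#   recycle = FALSE: the result has the length of 'i'.
#   recycle = TRUE:  'i' is repeated to the length of 'x' first.
subset_by_list <- function(x, i, recycle = FALSE) {
  stopifnot(is.list(x), is.list(i), length(i) <= length(x))
  if (recycle) {
    i <- rep_len(i, length(x))
  }
  x <- x[seq_along(i)]
  # Map() keeps the names of 'x' on the result.
  Map(function(xi, ii) xi[ii], x, i)
}

# Hypothetical helper: first 'n' elements of each list element, without
# padding short elements with NA.
pluck_head <- function(x, n) {
  lapply(x, utils::head, n = n)
}

a <- subset_by_list(x, list(1:5))                  # length 1: only chr1
b <- subset_by_list(x, list(1:5), recycle = TRUE)  # length 3; chr3 padded with NA
h <- pluck_head(x, 5)                              # length 3; chr3 keeps its 3 values
```

Note that with plain integer subscripts the recycled version pads
'chr3' with NAs because it has only 3 elements, which is why a
dedicated head-style helper may be the more convenient interface.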

Unfortunately, it's too late to change this for this release.

H.

>
> But maybe what we really want are pluckHead/Tail, which would be robust to
> the case that < n elements are in an element. And of course a more general
> pluck(x, i) to select 'i' from each element, but I wanted the line above to
> do that.
>
> Michael

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


