[Bioc-devel] Biostrings: XStringViews subsetting operator does not reduce element metadata

Hervé Pagès hpages at fhcrc.org
Thu Feb 21 20:43:31 CET 2013


Hi Jirka,

Clearly a bug. Thanks for the catch!

The regression was introduced a couple of years ago when the definition
of the Views class was modified. It should be fixed in IRanges 1.16.6
(release) and 1.17.34 (devel). Both should become available via
biocLite() in the next 24 hours or so.
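
Once the updated IRanges is installed, a quick sanity check along these
lines should pass. This is only a sketch adapted from Jirka's example
(the stopifnot() checks are an illustration, not a test taken from the
package):

```r
library(Biostrings)

# Two views on a short DNA string, with one metadata column
x <- Views(DNAString("GAA"), start=1:2, end=2:3)
mcols(x) <- DataFrame(score=3:4)

# Subsetting should subset the metadata columns too
y <- x[1]
stopifnot(nrow(mcols(y)) == length(y))
stopifnot(identical(mcols(y)$score, 3L))
```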

FWIW I did a systematic tour of the "[" methods that are defined for
Vector subclasses in the IRanges package, and found that all of them
are now propagating and subsetting the metadata columns, except the
method for Rle objects, which drops them:

   x <- Rle(13:11, 1:3)
   mcols(x) <- DataFrame(aa=letters[1:6])

Then:

   > x
   integer-Rle of length 6 with 3 runs
     Lengths:  1  2  3
     Values : 13 12 11
   > mcols(x)
   DataFrame with 6 rows and 1 column
              aa
     <character>
   1           a
   2           b
   3           c
   4           d
   5           e
   6           f
   > x[2:1]
   integer-Rle of length 2 with 2 runs
     Lengths:  1  1
     Values : 12 13
   > mcols(x[2:1])
   NULL

I didn't touch that one though. Not sure putting metadata cols on an Rle
is a good idea in the first place: they are stored with one row per
element (6 rows for the 3 runs above), which kind of defeats the
purpose of using an Rle. I suspect this is the reason why Rle objects
cannot receive names:

   > names(x) <- LETTERS[1:6]
   Error in names(x) <- LETTERS[1:6] : class 'Rle' has no 'names' slot

so it's kind of unexpected that you can put metadata cols on them.

Cheers,
H.


On 02/21/2013 04:46 AM, Jiří Hon wrote:
> Hi,
> I found out that the XStringViews/Views subsetting operator does not
> reduce element metadata as I would expect. See the following example.
> It is reproducible in both the stable (2.11) and development (2.12)
> versions of Bioconductor.
>
> ------------------------------------------------
> library(Biostrings)
> x <- Views(DNAString("GAA"), 1:2, 2:3)
> elementMetadata(x) <- DataFrame(score=3:4)
> y <- x[1]
> elementMetadata(y) <- elementMetadata(y)
> ------------------------------------------------
>
> The example fails with the following error message:
>
> ------------------------------------------------
> Error in `elementMetadata<-`(`*tmp*`, value = <S4 object of class
> "DataFrame">) :
>    the number of rows in elementMetadata 'value' (if non-NULL) must
> match the length of 'x'
> ------------------------------------------------
>
> Output of sessionInfo()
>
> ------------------------------------------------
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=cs_CZ.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=cs_CZ.UTF-8        LC_COLLATE=cs_CZ.UTF-8
>   [5] LC_MONETARY=cs_CZ.UTF-8    LC_MESSAGES=cs_CZ.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] Biostrings_2.26.3  IRanges_1.16.5     BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
> [1] parallel_2.15.2 stats4_2.15.2
> ------------------------------------------------
>
> As I am not familiar with the XStringViews inheritance structure, I
> can't even guess where a significant change was made. I only know that
> in Bioc 2.8.4 (R 2.13.1) this example is fully functional.
>
> Please take a look at this issue and if it's not a bug, but a feature,
> please be patient and explain.
>
> Thanks a lot
>
> Jirka

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


