[BioC] Did the behavior of as.vector(Rle(some.factor)) change on purpose?

Tue Aug 31 16:15:11 CEST 2010

Hi all,

It looks as if calling as.vector on a run-length-encoded factor now
turns it into a character vector.

Did this happen by accident, or was it a deliberate design decision?

Previously:

R-2.12, IRanges_1.7.19, GenomicRanges_1.1.20
(A factor of length one is returned):

R> a <- Rle(strand(c('+', '-', '+', '+', '-')))
R> as.vector(a[1])
[1] +
Levels: + - *

=============================

Now:
R-2.12, IRanges_1.7.31, GenomicRanges_1.1.20
(The factor is converted to a character vector):

R> a <- Rle(strand(c('+', '-', '+', '+', '-')))
R> as.vector(a[1])
[1] "+"

It seems like it would do what is expected (by me :-) if the method
returned by `getMethod('as.vector', c("Rle", "missing"))` were changed
from:

function (x, mode = "any")
rep.int(as.vector(runValue(x)), runLength(x))

To:

function (x, mode = "any")
rep.int(runValue(x), runLength(x))

but, upon further inspection, it seems like this was how it was
defined previously anyway, so ... I guess something motivated this
change?
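
For what it's worth, the same effect shows up in base R without IRanges,
which I assume is what is happening inside the method: as.vector() on a
plain factor returns a character vector, whereas rep.int() on the factor
itself keeps the class and levels. A minimal sketch (the variable names
are only for illustration, and the plain factor is a stand-in for
runValue(x), not the real IRanges object):

```r
# Base R only: a plain factor standing in for runValue(x) (an assumption;
# the real object comes from IRanges).
run_values  <- factor(c("+", "-"), levels = c("+", "-", "*"))
run_lengths <- c(2L, 1L)

# as.vector() on a factor drops the factor class and returns character
as_vec <- as.vector(run_values)

# expanding the factor directly keeps the class and levels
rep_fac <- rep.int(run_values, run_lengths)

# expanding the character vector, as the current method does
rep_chr <- rep.int(as_vec, run_lengths)

print(class(rep_fac))
print(class(rep_chr))
```

So the extra as.vector() around runValue(x) would be enough on its own
to explain the character result I am seeing.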

The complete sessionInfo for my last (buggy(?)) case is:

R version 2.12.0 Under development (unstable) (2010-07-07 r52477)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=C
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GenomicRanges_1.1.20 IRanges_1.7.31

loaded via a namespace (and not attached):
[1] tools_2.12.0

Thanks,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact