[BioC] IRanges: cbind not well defined for RangedData?
Michael Dondrup
Michael.Dondrup at uni.no
Thu Mar 18 15:55:07 CET 2010
Hi,
here is another possible little glitch with RangedData and cbind(). I would like to propose changing or
extending the behavior of the cbind function, or at least adding to its documentation. The use case is as
follows:
Assume we have some chromosomal ranges in a RangedData object. We can then iteratively compute statistics on
these ranges and attach them to the DataFrame holding the extra data, e.g. count data or combined quality scores, possibly from multiple conditions.
So according to the documentation of the RangedData-class,
> The first mode treats the object as a contiguous "data frame" annotated with range information.
> The accessors start, end, and width get the corresponding fields in the ranges as atomic integer vectors, undoing
> the division over the spaces. The [[ and matrix-style [, extraction and subsetting functions unroll the data in the same way. [[<- does the inverse.
I assumed I could use cbind(rd, a.value) to attach the statistics to the internal data representation. Would it be possible to
make cbind return something more useful, or is there a better way to do it?
Best
Michael
Example:
> a.value = rnorm(4)
> rd1 = RangedData(ranges=IRanges(start=runif(4, min=1, max=10E8), width=runif(4, min=1, max=10E5), names=paste("bla",1:4)), space=1:2)
> rd1
RangedData with 4 rows and 0 value columns across 2 spaces
            space                 ranges |
      <character>              <IRanges> |
bla 1           1 [773679042, 774010137] |
bla 3           1 [194819013, 195136171] |
bla 2           2 [183105318, 183509803] |
bla 4           2 [107730452, 107823748] |
> obj = cbind(rd1, a.value)
I would intuitively expect the result to have the same structure as this (the ranges are drawn at random again, so the numbers differ):
> RangedData(ranges=IRanges(start=runif(4, min=1, max=10E8), width=runif(4, min=1, max=10E5), names=paste("bla",1:4)), space=1:2, a.value)
RangedData with 4 rows and 1 value column across 2 spaces
            space                 ranges |    a.value
      <character>              <IRanges> |  <numeric>
bla 1           1 [473042533, 473820859] | -1.7956588
bla 3           1 [ 75991383,  76022516] |  0.3588571
bla 2           2 [475385363, 476224756] |  1.4166218
bla 4           2 [532603052, 532902678] |  0.2324424
But what I get is quite different:
> class(obj)
[1] "matrix"
> typeof(obj)
[1] "list"
> obj
rd1 a.value
[1,] ? 0.3255676
[2,] ? 0.5913471
[3,] ? 0.9317755
[4,] ? -0.8897527
> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] IRanges_1.4.9
loaded via a namespace (and not attached):
[1] tools_2.10.1
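One route I have not tried above is the [[<- replacement that the quoted documentation mentions. A minimal sketch, assuming that [[<- accepts a plain numeric vector with one entry per range and assigns it in the unrolled row order that the print method shows (both are my assumptions and are not verified against IRanges 1.4.9):

```r
library(IRanges)

set.seed(1)  # only so that the sketch is reproducible
a.value <- rnorm(4)

# Same construction as in the example above, with integer starts and widths.
rd1 <- RangedData(
  ranges = IRanges(start = as.integer(runif(4, min = 1, max = 10E8)),
                   width = as.integer(runif(4, min = 1, max = 10E5)),
                   names = paste("bla", 1:4)),
  space = 1:2
)

# Assumption: `[[<-` adds a.value as a new value column. The values are taken
# in the unrolled row order (rows grouped by space), not matched by range name.
rd1[["a.value"]] <- a.value

rd1               # expected to report one value column
rd1[["a.value"]]  # `[[` unrolls the column back into a plain vector
```

If this behaves as assumed, the result stays a RangedData object instead of the list matrix that cbind() returns.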