[BioC] IRanges: list columns in RangedData objects (was Re: IRanges: cbind not well defined for RangedData?)
Patrick Aboyoun
paboyoun at fhcrc.org
Fri Mar 19 20:59:43 CET 2010
Michael,
Thanks for the report. RangedData objects have been designed to hold
list objects in the values columns. You did, however, find a bug in the
printing of a RangedData object when it contains a list column. I fixed
the show method in both BioC 2.5 IRanges (>= 1.4.16) and BioC 2.6
IRanges (>= 1.5.66) to handle this case.
> rd <- RangedData(IRanges(start=1:4, width=10, names=paste("a",1:4)),
space=1:2 )
> rd$a.value <- rnorm(4)
> rd$a.list <- as.list(1:4)
> rd
RangedData with 4 rows and 2 value columns across 2 spaces
          space    ranges |   a.value   a.list
    <character> <IRanges> | <numeric>   <list>
a 1           1   [1, 10] | 0.5362468 ########
a 3           1   [3, 12] | 0.5459593 ########
a 2           2   [2, 11] | 0.4705777 ########
a 4           2   [4, 13] | 0.4160833 ########
As you noticed, a list column in a RangedData object will result in
column expansion if you convert it to a data.frame, which can lead to a
very large data object if the number of rows in the RangedData object is
large.
Since the show method prints out the classes of each of the columns, the
user will be able to check to ensure their data columns are stored
correctly prior to any conversion to a data.frame.
> as.data.frame(rd)
  space start end width names   a.value a.list.1L a.list.2L a.list.3L a.list.4L
1     1     1  10    10   a 1 0.5362468         1         2         3         4
2     1     3  12    10   a 3 0.5459593         1         2         3         4
3     2     2  11    10   a 2 0.4705777         1         2         3         4
4     2     4  13    10   a 4 0.4160833         1         2         3         4
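
For what it is worth, the same kind of expansion can be reproduced in
plain R without IRanges. The following is only a minimal sketch using
base data.frame(), not the RangedData code path; the names df_expanded
and df_kept are made up for illustration. It shows a bare list being
spread into one recycled column per element, and I() keeping it as a
single list column.

```r
# Base-R sketch; `df_expanded` and `df_kept` are illustrative names only.
x <- 1:4
a.list <- as.list(1:4)

# A bare list is treated as a set of columns: each element becomes its
# own column and is recycled down all rows (4 rows, 1 + 4 columns).
df_expanded <- data.frame(x = x, a.list = a.list)
dim(df_expanded)

# Wrapping the list in I() keeps it as one list column (4 rows, 2 columns).
df_kept <- data.frame(x = x, a.list = I(a.list))
dim(df_kept)

# Check which columns are lists before converting or combining data.
sapply(df_kept, is.list)
```

With n rows and a list of length n, the expanded form holds on the order
of n * n values, which is why a check like the last line is cheap
insurance before converting a large object.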
Patrick
On 3/19/10 7:23 AM, Michael Dondrup wrote:
> Dear Patrick and Michael,
>
> thank you very much for your helpful support on my last two connected issues! It is in the
> examples in the documentation, but I must have overlooked it.
>
> I tried it out immediately, and it works fine:
>
>
>> rd = RangedData(IRanges(start=1:4, width=10, names=paste("a",1:4)), space=1:2 )
>> rd
>> rd$a.value = rnorm(4)
>> rd
>>
> RangedData with 4 rows and 1 value column across 2 spaces
> space ranges | a.value
> <character> <IRanges> |<numeric>
> 1 1 [1, 10] | -0.6765515
> 2 1 [3, 12] | 1.5406962
> 3 2 [2, 11] | -1.2599696
> 4 2 [4, 13] | 0.4971178
>
> But then I had to reboot my computer because I accidentally tried this on 100,000 ranges
> where the value was actually a list, not a vector, and then the recycling rule struck me:
>
>
>> rd$a.list = as.list(1:4)
>>
> At first everything seems fine and normal, but if you try to print it:
>
>> rd
>>
> RangedData with 4 rows and 1 value column across 2 spaces
> Error in .Method(..., deparse.level = deparse.level) :
> number of rows of matrices must match (see arg 2)
> or try to convert into a data.frame:
>
>> as.data.frame(rd)
>>
>   space start end width names a.list.1L a.list.2L a.list.3L a.list.4L
> 1     1     1  10    10   a 1         1         2         3         4
> 2     1     3  12    10   a 3         1         2         3         4
> 3     2     2  11    10   a 2         1         2         3         4
> 4     2     4  13    10   a 4         1         2         3         4
>
> When I tried this, I ran into some memory problems.
>
> This is just a warning to make sure you really use a vector here. Maybe something to put in the
> type checking, or documentation?
>
> Anyway, thanks a lot again
> Michael
>
>