[BioC] GRanges, strange behavior when chromosomes don't match
Cei Abreu-Goodger
cei at ebi.ac.uk
Fri Jul 9 23:52:16 CEST 2010
Thanks Patrick!
Patrick Aboyoun wrote:
> Cei,
> I checked in a patch to BioC 2.6 (GenomicRanges 1.0.6) and BioC 2.7
> (GenomicRanges 1.1.16) that fixes the issue you found in the intersect
> and setdiff methods for GRanges objects when the two objects don't have
> the same sequence names. These new packages will be available on
> bioconductor.org within the next 36 hours.
>
> The intersect and setdiff method for GRanges now form the union of
> sequence names for both objects before performing their operations. The
> output for the examples you sent are now
>
> > library(GenomicRanges)
> > gr1 <- GRanges(seqnames = c("chr1", "chr2", "chr3", "chr4"),
> + ranges = IRanges(1:4, width = 10),
> + strand = c("-", "+", "+", "+"))
> > gr2 <- GRanges(seqnames = c("chr1", "chr2", "chr5", "chr6"),
> + ranges = IRanges(1:4, width = 10),
> + strand = c("-", "+", "+", "+"))
>
> > intersect(gr1,gr2)
> GRanges with 2 ranges and 0 elementMetadata values
> seqnames ranges strand |
> <Rle> <IRanges> <Rle> |
> [1] chr1 [1, 10] - |
> [2] chr2 [2, 11] + |
>
> seqlengths
> chr1 chr2 chr3 chr4 chr5 chr6
> NA NA NA NA NA NA
>
> > setdiff(gr1,gr2)
> GRanges with 2 ranges and 0 elementMetadata values
> seqnames ranges strand |
> <Rle> <IRanges> <Rle> |
> [1] chr3 [3, 12] + |
> [2] chr4 [4, 13] + |
>
> seqlengths
> chr1 chr2 chr3 chr4 chr5 chr6
> NA NA NA NA NA NA
>
> > sessionInfo()
> R version 2.12.0 Under development (unstable) (2010-07-01 r52425)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] GenomicRanges_1.1.16 IRanges_1.7.10
>
>
> Cheers,
> Patrick
>
>
>
> On 7/8/10 4:55 PM, Patrick Aboyoun wrote:
>> Cei,
>> Thanks for the bug report. I am looking into the issue now and will
>> have a patch out shortly.
>>
>>
>> Cheers,
>> Patrick
>>
>>
>> On 7/8/10 9:22 AM, Cei Abreu-Goodger wrote:
>>> Hello all,
>>>
>>> When you have two GRanges objects which don't have the same seqnames
>>> factor, you can get very unexpected behavior.
>>>
>>> Would it be possible to get an error/warning when this is attempted?
>>> I don't know how many functions are affected, perhaps the functions
>>> mentioned below can be modified to produce a more obvious result...
>>>
>>> Many thanks,
>>>
>>> Cei
>>>
>>>
>>> Code example / sessionInfo:
>>>
>>> library(GenomicRanges)
>>> gr1 <- GRanges(seqnames = c("chr1", "chr2", "chr3", "chr4"),
>>> ranges = IRanges(1:4, width = 10),
>>> strand = c("-", "+", "+", "+"))
>>> gr2 <- GRanges(seqnames = c("chr1", "chr2", "chr5", "chr6"),
>>> ranges = IRanges(1:4, width = 10),
>>> strand = c("-", "+", "+", "+"))
>>>
>>> union(gr1,gr2)
>>> # gives expected result, all 6 ranges
>>>
>>> intersect(gr1,gr2)
>>> # includes ranges from the missing gr2 chromosomes
>>>
>>> setdiff(gr1,gr2)
>>> # includes 3 copies of the ranges from the missing gr2 chromosomes
>>>
>>>
>>> sessionInfo()
>>> R version 2.11.0 (2010-04-22)
>>> i386-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices datasets utils methods base
>>>
>>> other attached packages:
>>> [1] GenomicRanges_1.0.5 IRanges_1.6.8 Biobase_2.8.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.11.0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list