[BioC] endoapply over GRangesList fails when an empty GRanges object is returned (Bug?)
Patrick Aboyoun
paboyoun at fhcrc.org
Wed Aug 25 08:43:58 CEST 2010
Steve,
Thanks for reporting this issue. It turns out that the endoapply
surfaced the error, but didn't cause it. The problem was upstream in the
split,GRanges-method where there was a bug when splitting by a factor or
'factor' Rle that didn't have all levels present. (To check this try
validObject(grl) on your example below.) This issue has been corrected
in BioC 2.6 (GenomicRanges 1.0.8) and BioC 2.7 (GenomicRanges 1.1.21),
which should be available for download from bioconductor.org within 36
hours.
Cheers,
Patrick
On 8/24/10 2:36 PM, Michael Lawrence wrote:
> On Tue, Aug 24, 2010 at 2:21 PM, Steve Lianoglou<
> mailinglist.honeypot at gmail.com> wrote:
>
>> Hi all,
>>
>> I think this is the wrong behavior .. an endoapply(GRangesList,
>> function(...)) will fail if the embedded function returns and empty
>> GRanges object as one of its operations.
>>
>> Calling GRangesList(lapply(GRangesList, function(...))) will work.
>> So too would an endoapply over an IRangesList that does the same thing ...
>>
>> In the code below -- the result that's meant to be stored in "fail" is
>> what I'm talking about. The other two code blocks is just to show you
>> that it works if you do an end-around of the endoapply(GRangesList,
>> ...).
>>
>> I think I would expect the result that comes back in "good" down below
>> to be the same thing that "fail" should be.
>>
>>
>> gr<- GRanges(seqnames='chr1', strand='+',
>> ranges=IRanges(start=c(sample(18:22, 10, replace=TRUE),
>> sample(100:108, 10, replace=TRUE)),
>> width=sample(18:20, 20, replace=TRUE)))
>> strand(gr)[2:5]<- '-'
>> grl<- split(gr, strand(gr))
>>
>> itake<- IRanges(start=100, end=150)
>> gtake<- GRanges(seqnames='chr1', ranges=itake)
>>
>> ## This will fail, don't think it should
>> fail<- endoapply(grl, function(x) {
>> subsetByOverlaps(x, gtake)
>> })
>>
>> ## This works, and ultimately is what the previous one should be
>> good<- GRangesList(lapply(grl, function(x) {
>> take<- GRanges(seqnames='chr1', ranges=itake)
>> subsetByOverlaps(x, gtake)
>> }))
>>
>> ## Test IRangesList: this works.
>> irl<- IRangesList(lapply(grl, ranges))
>>
> I assume someone else will take care of the bug, but wanted to mention that
> you could replace the above command with:
>
> irl<- seqapply(grl, ranges)
>
> endoapply is a stricter version of seqapply that enforces that the return
> value is the same type as the first argument.
>
> check<- endoapply(irl, function(x) {
>> subsetByOverlaps(x, itake)
>> })
>>
>> Thanks,
>> -steve
>>
>> R version 2.12.0 Under development (unstable) (2010-08-18 r52771)
>> Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] grid stats graphics grDevices utils datasets
>> methods base
>>
>> other attached packages:
>> [1] org.Hs.eg.db_2.4.1 AnnotationDbi_1.11.4
>> [3] Biobase_2.9.0 BSgenome.Hsapiens.UCSC.hg18_1.3.16
>> [5] ggplot2_0.8.8 proto_0.3-8
>> [7] reshape_0.8.3 doMC_1.2.1
>> [9] multicore_0.1-3 foreach_1.3.0
>> [11] codetools_0.2-2 iterators_1.0.3
>> [13] GenomeGraphs_1.9.0 biomaRt_2.5.1
>> [15] bitops_1.0-4.1 Rsamtools_1.1.12
>> [17] data.table_1.5 plyr_1.1
>> [19] BSgenome_1.17.6 Biostrings_2.17.28
>> [21] RSQLite_0.9-2 DBI_0.2-5
>> [23] GenomicFeatures_1.1.11 GenomicRanges_1.1.20
>> [25] IRanges_1.7.19
>>
>> loaded via a namespace (and not attached):
>> [1] RCurl_1.4-3 rtracklayer_1.9.7 tools_2.12.0 XML_3.1-1
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact<http://cbio.mskcc.org/%7Elianos/contact>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list