[BioC] GenomicRanges:::.similarSeqnameConvention regular expressions needs some tweaking?
Martin Morgan
mtmorgan at fhcrc.org
Tue Aug 24 18:14:46 CEST 2010
On 08/24/2010 08:16 AM, Steve Lianoglou wrote:
> Hi,
>
> Sorry to be a pest about this, but could we get some traction on this?
>
> I've temporarily commented out the isArabic regex test as a workaround,
> but I want to keep my own analysis code in line with the real
> GenomicRanges package.
We've discussed this locally and will make changes this week. Martin
>
> Thanks,
> -steve
>
>
> On Fri, Aug 20, 2010 at 12:53 PM, Steve Lianoglou
> <mailinglist.honeypot at gmail.com> wrote:
>> Hi all,
>>
>> The GenomicRanges:::.similarSeqnameConvention function is returning
>> FALSE where, IMHO, it shouldn't be.
>>
>> I've landed in a situation where this function is called with the
>> following values for seqs1/2:
>>
>> seqs1:
>> [1] "chr1" "chr1_random" "chr10" "chr10_random" "chr11" "chr11_random"
>> [7] "chr12" "chr13" "chr13_random" "chr14" "chr15" "chr15_random"
>> [13] "chr16" "chr16_random" "chr17" "chr17_random" "chr18" "chr18_random"
>> [19] "chr19" "chr19_random" "chr2" "chr2_random" "chr20" "chr21"
>> [25] "chr21_random" "chr22" "chr22_random" "chr22_h2_hap1" "chr3" "chr3_random"
>> [31] "chr4" "chr4_random" "chr5" "chr5_random" "chr5_h2_hap1" "chr6"
>> [37] "chr6_random" "chr6_cox_hap1" "chr6_qbl_hap2" "chr7" "chr7_random" "chr8"
>> [43] "chr8_random" "chr9" "chr9_random" "chrM" "chrX" "chrX_random"
>> [49] "chrY"
>>
>> seqs2:
>> [1] "chrY"
>>
>> and it looks like the "isArabic" function in funList is the culprit
>> here. Perhaps this regex test isn't so necessary, given all the other
>> tests that are being run?
>>
>> I guess it's not so easy to come up w/ a perfect heuristic for this
>> function to check "comparable seqnames", but IMHO my scenario should
>> pass as "good" (i.e., the conventions are similar).
>>
>> Another scenario would be to just have this function return TRUE when
>> the intersection between seqs1 and seqs2 is length 0. I guess that
>> must be too simple though ...
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>
>
>
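To make the reported mismatch concrete, here is a minimal sketch in Python (used only for illustration; it is not the GenomicRanges code). It assumes that isArabic flags a set of names when any name contains a digit and that the two sets must agree on that flag; the actual regex is not shown in this thread, so has_arabic, similar_by_flags and similar_by_overlap are hypothetical names. The overlap check reads the intersection suggestion as "the two sets share at least one name", which is an interpretation (the message as written says length 0).

```python
import re

# Hypothetical stand-in for the isArabic test: does any name contain a digit?
# The real regex used by GenomicRanges is not shown in this thread.
def has_arabic(names):
    return any(re.search(r"[0-9]", name) for name in names)

def similar_by_flags(seqs1, seqs2):
    # Assumed behaviour: both sets must agree on the per-set flag.
    return has_arabic(seqs1) == has_arabic(seqs2)

def similar_by_overlap(seqs1, seqs2):
    # One reading of the intersection idea: any shared name is enough.
    return len(set(seqs1) & set(seqs2)) > 0

# Abbreviated version of the seqs1/seqs2 values quoted above.
seqs1 = ["chr1", "chr1_random", "chr10", "chrM", "chrX", "chrY"]
seqs2 = ["chrY"]

print(similar_by_flags(seqs1, seqs2))    # False
print(similar_by_overlap(seqs1, seqs2))  # True
```

Under these assumptions the flag comparison fails simply because "chrY" alone contains no digit, even though it is one of the names in seqs1.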
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793