[Bioc-sig-seq] GRanges, failure assigning chromosome lengths

Chris Seidel seidel at phaget4.org
Sat Sep 4 00:07:29 CEST 2010


Did anything ever get resolved in terms of assigning chromosome lengths
to a GRanges object when it contains alignments that run off the
chromosome ends? The message below was the last of the original thread
that I could find.

I'm currently reading Solexa export files into a GRanges object, and
setting the chromosome lengths sometimes fails with an error when the
object contains a few reads that extend past a chromosome boundary. The
only solution I see is to discard the offending reads, which means
writing a function that loops through all reads and checks each one
against its chromosome length. Since Ivan raised this problem back in
April, I was wondering whether a solution was ever reached (or whether
anyone knows of an efficient way to address it).
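
A minimal sketch of the boundary check described above, written in
Python purely to illustrate the logic (it is not GenomicRanges code).
The Read type, both function names, and the example chromosome lengths
are hypothetical, and coordinates are assumed to be 1-based and
inclusive. The second function shows the clipping alternative mentioned
in the quoted thread instead of dropping reads outright.

```python
from typing import Dict, List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical aligned read; coordinates are 1-based, inclusive."""
    chrom: str
    start: int
    end: int


def split_by_bounds(
    reads: List[Read], chrom_lengths: Dict[str, int]
) -> Tuple[List[Read], List[Read]]:
    """Separate reads that lie fully on their chromosome from those that do not."""
    kept, dropped = [], []
    for read in reads:
        length = chrom_lengths[read.chrom]
        if read.start >= 1 and read.end <= length:
            kept.append(read)
        else:
            dropped.append(read)
    return kept, dropped


def clip_to_bounds(reads: List[Read], chrom_lengths: Dict[str, int]) -> List[Read]:
    """Trim each read to [1, chromosome length]; skip reads entirely off the end."""
    clipped = []
    for read in reads:
        length = chrom_lengths[read.chrom]
        start = max(read.start, 1)
        end = min(read.end, length)
        if start <= end:
            clipped.append(Read(read.chrom, start, end))
    return clipped


# Made-up example data, not taken from any real export file.
chrom_lengths = {"chr1": 1000, "chr2": 500}
reads = [
    Read("chr1", 10, 45),
    Read("chr1", 980, 1015),  # runs past the end of chr1
    Read("chr2", -5, 30),     # starts before position 1
    Read("chr2", 466, 500),   # ends exactly at the boundary
]

kept, dropped = split_by_bounds(reads, chrom_lengths)
clipped = clip_to_bounds(reads, chrom_lengths)
```

Either result could then be used to build an object whose ranges all
fit within the declared chromosome lengths. Whether dropping or clipping
is appropriate depends on the downstream analysis.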

-Chris

> -----Original Message-----
> From: bioc-sig-sequencing-bounces at r-project.org 
> [mailto:bioc-sig-sequencing-bounces at r-project.org] On Behalf 
> Of Patrick Aboyoun
> Sent: Tuesday, April 27, 2010 12:39 PM
> To: Sean Davis
> Cc: bioc-sig-sequencing at r-project.org
> Subject: Re: [Bioc-sig-seq] GRanges, failure assigning chromosome lengths
> 
> 
> Sean and Ivan,
> Thanks for the insight. I'll look at devising a compromise within the
> existing framework. I need to explore the various methods for GRanges
> object to better understand the impact of a compromise. We started with
> the simplest interpretation of limit bounds because it simplifies the
> code. For example, we need to establish the rules for coverage or
> findOverlaps when the DNA is circular or the alignment runs off the end
> of a linear chromosome.
> 
> 
> Patrick
> 
> 
> On 4/27/10 8:05 AM, Sean Davis wrote:
> > On Tue, Apr 27, 2010 at 10:51 AM, Ivan Gregoretti <ivangreg at gmail.com> wrote:
> >    
> >> Good morning Sean and everybody,
> >>
> >>      
> >>> Actually, the edge case is general as alignments, even on linear
> >>> chromosomes, may extend beyond the end of the chromosome, I believe.
> >>> In the best case, these alignments are clipped (in CIGAR terms), but
> >>> I don't know that all aligners are doing that appropriately.
> >>>
> >>> Sean
> >>>        
> >> So, you would rather go for an overriding switch than an
> >> infrastructure overhaul?
> >>
> >> I ask this because GRanges is an exceptionally convenient format for
> >> ChIP-seqers and Patrick is trying to make a decision to make it work
> >> for real world data.
> >>      
> > I guess that I mean to say that the two issues of aligning off the end
> > of the chromosome and handling circular genomes are related but
> > separate issues.  An override seems quite reasonable for dealing with
> > the former.  Until aligners or common formats (BAM/SAM) deal with the
> > latter, it will be difficult to deal appropriately with circular
> > genomes, so an override is probably a fine compromise.
> >
> > Sean
> >
> >
> >    
> >> And yes indeed: aligners do align a little bit past the boundaries 
> >> even for linear chromosomes. Thanks for pointing that out!
> >>
> >> Ivan
> >>
> >>
> 


