Thanks a lot, Paul and Steve, for your help. I see that I have a little bit of
ground to cover, and Rsamtools seems like a good place to start.
Cheers
Chu
________________________________
From: Paul Leo
To: Steve Lianoglou
Cc: Chu Zhang ; bioc-sig-sequencing@r-project.org
Sent: Fri, 11 February, 2011 2:31:16
Subject: Re: [Bioc-sig-seq] processing alignments
Hi Steve,
Yes, good point! You would also need to check NM. You probably could not use
straight ##M CIGAR strings either, as soft clips and hard clips may appear in
the CIGAR even though the sequence still aligns perfectly. Chu, the NM tags (as
well as all other components you might need) are in BAM (or SAM) and can be
accessed with Rsamtools if you decide to go that way.
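
To make the clipping point concrete, here is a minimal Python sketch (standard library only, not Rsamtools code). The helper names `cigar_ops` and `is_perfect` and the `allow_clips` switch are hypothetical, made up for illustration. It assumes "perfect" means NM is 0 and the CIGAR holds only match operations, optionally flanked by soft or hard clips.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_ops(cigar):
    """Split a CIGAR string such as '5S33M' into [(5, 'S'), (33, 'M')]."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings that the pattern did not consume completely (e.g. '*').
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"unparseable CIGAR: {cigar!r}")
    return ops


def is_perfect(cigar, nm, allow_clips=True):
    """Hypothetical check: no edits (NM == 0) and only match operations.

    NM is the edit distance to the reference and does not count clipped
    bases, so a clipped read can still have NM == 0.
    """
    allowed = {"M", "="}
    if allow_clips:
        allowed |= {"S", "H"}  # assumption: clipped ends are acceptable
    return nm == 0 and all(op in allowed for _, op in cigar_ops(cigar))


print(is_perfect("38M", 0))
print(is_perfect("5S33M", 0))
print(is_perfect("5S33M", 0, allow_clips=False))
```

Whether clipped reads should count as perfect depends on the question being asked, which is why the sketch leaves that as a switch rather than deciding it.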
Cheers
Paul
-----Original Message-----
From: Steve Lianoglou
To: Paul Leo
Cc: Chu Zhang , bioc-sig-sequencing@r-project.org
Subject: Re: [Bioc-sig-seq] processing alignments
Date: Thu, 10 Feb 2011 20:58:51 -0500
Hi,
On Thu, Feb 10, 2011 at 8:48 PM, Paul Leo wrote:
>
> Also, if your alignments are in BAM format, you can use Rsamtools to
> extract that region. Inspection of the CIGAR will tell you which reads
> aligned perfectly. That would be an extremely fast calculation.
Actually, I'm not sure that that's true, is it? Don't CIGAR strings
only really tell you about indels?
Say you have two reads, both 38bp long.
If one aligns perfectly, its CIGAR is 38M.
If the other aligns with 1 mismatch, it's still 38M.
You can use the NM tag if you're after 'perfect matches', though ...
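
The two-read example can be sketched in Python as follows (plain string handling, no BAM library). The two SAM records are invented for illustration, and `parse_sam_line` is a hypothetical helper. It assumes the aligner wrote an NM tag, which is optional in SAM.

```python
def parse_sam_line(line):
    """Return (read name, CIGAR, NM) from one tab-separated SAM record.

    NM is None when the record carries no NM tag.
    """
    fields = line.rstrip("\n").split("\t")
    name, cigar = fields[0], fields[5]  # QNAME is column 1, CIGAR is column 6
    nm = None
    # Optional tags start after the 11 mandatory columns, as TAG:TYPE:VALUE.
    for tag in fields[11:]:
        tag_name, _tag_type, value = tag.split(":", 2)
        if tag_name == "NM":
            nm = int(value)
    return name, cigar, nm


# Two made-up 38 bp reads: identical CIGAR, different edit distance.
seq, qual = "A" * 38, "I" * 38
records = [
    "\t".join(["read1", "0", "chr1", "100", "60", "38M", "*", "0", "0", seq, qual, "NM:i:0"]),
    "\t".join(["read2", "0", "chr1", "200", "60", "38M", "*", "0", "0", seq, qual, "NM:i:1"]),
]

for record in records:
    name, cigar, nm = parse_sam_line(record)
    print(name, cigar, "perfect" if nm == 0 else f"{nm} edit(s)")
```

Both records report 38M, so only the NM value separates the perfect match from the read with one mismatch.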