[Bioc-sig-seq] SOLID data

nicolas servant Nicolas.Servant at curie.fr
Thu Feb 5 12:12:23 CET 2009


Hi all,

As Daniel said, converting color space to a base-space representation 
could be a solution for some applications.
But we have to be careful with this trick, especially for mapping.
For example, in color space, two adjacent color mismatches correspond to 
only one base mismatch. That is why we have to take the properties of 
color space into account.
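
To illustrate the point about adjacent mismatches, here is a minimal
Python sketch. It assumes the usual two-base encoding, where each color
is the XOR of the 2-bit codes of two neighbouring bases (A=0, C=1, G=2,
T=3). The function names and the example sequences are made up for
illustration, and the leading primer base of a real SOLiD read is left
out for simplicity.

```python
# 2-bit base codes; the color of a dinucleotide is the XOR of its two codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colors(seq):
    """Return one color (0-3) per pair of adjacent bases in seq."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def mismatch_positions(x, y):
    """Return the 0-based positions where two equal-length sequences differ."""
    return [i for i, (a, b) in enumerate(zip(x, y)) if a != b]


reference = "ATGGCA"
snp_read = "ATGACA"  # hypothetical read with a single base change (G -> A)

base_mismatches = mismatch_positions(reference, snp_read)
color_mismatches = mismatch_positions(to_colors(reference), to_colors(snp_read))

print(base_mismatches)   # [3]
print(color_mismatches)  # [2, 3]
```

The single base change touches the two dinucleotides that contain it, so
it shows up as two adjacent color mismatches. A mapper that counts
mismatches in color space should therefore treat such a pair differently
from two independent errors.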

Best,

Nicolas

Daniel Klevebring wrote:
> Hi all, 
>
> In the SOLiD system, reads are mapped to a color-space encoded version 
> of the reference sequence in question, after which start and end 
> coordinates are reported. Base-space sequence can then (if needed) be 
> extracted from the reference sequence using the given coordinates. 
>
> There is a "double encoding" system, where the colors (0, 1, 2, 3) are 
> changed to letters (A, C, G, T) to trick certain software into working 
> with SOLiD data. This does not correspond to the actual base-space 
> sequence; it is only a representation of the color-space sequence. I 
> guess it would be possible to use this trick to make Biostrings and 
> ShortRead work with SOLiD data. 
>
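
A small Python sketch of the double encoding described above, assuming
the mapping 0->A, 1->C, 2->G, 3->T given in the message. The function
names and the example color string are hypothetical.

```python
# Translation tables for the "double encoding": colors <-> letters.
ENCODE = str.maketrans("0123", "ACGT")
DECODE = str.maketrans("ACGT", "0123")


def double_encode(colors):
    """Rewrite a color string such as '31031' using the letters A, C, G, T."""
    return colors.translate(ENCODE)


def double_decode(letters):
    """Recover the color string from its double-encoded form."""
    return letters.translate(DECODE)


colors = "31031"  # example color string, made up for illustration
encoded = double_encode(colors)
print(encoded)                 # TCATC
print(double_decode(encoded))  # 31031
```

The letters in the encoded string are only labels for colors. They are
not the bases of the read, so any downstream step that interprets them
as nucleotides (complementing, translating, and so on) would give wrong
results.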
> However, one very important feature of the SOLiD system is that the 
> reverse complement sequence corresponds to the reverse color-space 
> sequence (there is no "complement" in color-space). This means that 
> the algorithm that returns the rev-comp sequence prior to matching 
> on the (-)-strand needs to be rewritten to report the reverse 
> sequence instead of the rev-comp.
>
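
The reverse-complement property can be checked with a short Python
sketch. It assumes the usual XOR-based two-base encoding (A=0, C=1, G=2,
T=3); the helper names and the test sequences are made up for
illustration. Complementing a base flips both bits of its code, so the
XOR of two neighbouring bases, and hence the color, does not change.

```python
# 2-bit base codes; the color of a dinucleotide is the XOR of its two codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_colors(seq):
    """Return one color (0-3) per pair of adjacent bases in seq."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def reverse_complement(seq):
    """Return the base-space reverse complement of seq."""
    return seq.translate(COMPLEMENT)[::-1]


seq = "ATGGCA"
forward_colors = to_colors(seq)
minus_strand_colors = to_colors(reverse_complement(seq))

print(forward_colors)       # [3, 1, 0, 3, 1]
print(minus_strand_colors)  # [1, 3, 0, 1, 3]
print(minus_strand_colors == forward_colors[::-1])  # True
```

So for matching on the (-)-strand, a double-encoded color read only
needs to be reversed, not reverse-complemented.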
> Did all this make sense...? Basically, I think it would be possible to 
> make it work if the colors are "double-encoded" and the internal 
> function that rev-comps a sequence is modified to report the reverse.
>
> Best
> Daniel Klevebring
>
>
> On 5 feb 2009, at 01.39, Martin Morgan wrote:
>
>> , once reads are represented
>> as traditional nucleotide sequences (which I guess they must be at
>> some point?).
>
> --
>
> Contact information:
>
> Daniel Klevebring
> M. Sc. Eng., Ph.D. Student
> Dept of Gene Technology
> Royal Institute of Technology, KTH
> SE-106 91 Stockholm, Sweden
>
> Visiting address: Roslagstullsbacken 21, B3
> Delivery address: Roslagsvägen 30B, 104 06, Stockholm
> Invoice address: KTH Fakturaserice, Ref DAKL KTHBIO, Box 24075, 
> SE-10450 Stockholm
> E-mail: daniel at biotech.kth.se <mailto:daniel at biotech.kth.se>
> E-mail: daniel at arrayadvice.se <mailto:daniel at arrayadvice.se>
> Phone: +46 8 5537 8337 (Office)
> Phone: +46 704 71 65 91 (Mobile)
> Web: http://www.biotech.kth.se/genetech/index.html
> Web: http://www.arrayadvice.se/
> Fax: +46 8 5537 8481
> MSN messenger: klevebring at msn.com <mailto:klevebring at msn.com>
>


-- 
Nicolas Servant
Equipe Bioinformatique
Institut Curie
26, rue d'Ulm - 75248 Paris Cedex 05 - FRANCE

Email: Nicolas.Servant at curie.fr
Tel: 01 56 24 69 85
http://bioinfo.curie.fr/


