[Bioc-sig-seq] SOLID data
nicolas servant
Nicolas.Servant at curie.fr
Thu Feb 5 12:12:23 CET 2009
Hi all,
As Daniel said, converting color space into base space could be a
solution for some applications.
But we have to be careful with this trick, especially for the mapping.
For example, in color space, 2 adjacent mismatches correspond to
only one base mismatch, because each base contributes to the two colors
on either side of it. That's why we have to take the color space
properties into account.
Best,
Nicolas
Daniel Klevebring wrote:
> Hi all,
>
> In the SOLiD system, reads are mapped to a color-space encoded version
> of the reference sequence in question, after which start and end
> coordinates are reported. Base-space sequence can then (if needed) be
> extracted from the reference sequence using the given coordinates.
>
> There is a "double encoding" system, where the colors (0, 1, 2, 3) are
> changed to letters (A, C, G, T) to trick certain software into working
> with SOLiD data. This does not correspond to the actual base-space
> sequence, it's only a representation of the color-space sequence. I
> guess it would be possible to use this trick to make Biostrings and
> ShortRead work with SOLiD data.
>
> However, one very important feature of the SOLiD system is that the
> reverse complement sequence corresponds to the reverse color-space
> sequence (there is no "complement" in color-space). This means that
> the algorithm that returns the rev-comp sequence prior to matching
> on the (-)-strand needs to be rewritten to report the reverse
> sequence instead of the rev-comp.
>
> Did all this make sense...? Basically, I think it would be possible to
> make it work if the colors are "double-encoded" and the internal
> function that rev-comps a sequence is modified to report the reverse.
>
> Best
> Daniel Klevebring
>
>
> On 5 feb 2009, at 01.39, Martin Morgan wrote:
>
>> , once reads are represented
>> as traditional nucleotide sequences (which I guess they must be at
>> some point?).
>
> --
>
> Contact information:
>
> Daniel Klevebring
> M. Sc. Eng., Ph.D. Student
> Dept of Gene Technology
> Royal Institute of Technology, KTH
> SE-106 91 Stockholm, Sweden
>
> Visiting address: Roslagstullsbacken 21, B3
> Delivery address: Roslagsvägen 30B, 104 06, Stockholm
> Invoice address: KTH Fakturaserice, Ref DAKL KTHBIO, Box 24075,
> SE-10450 Stockholm
> E-mail: daniel at biotech.kth.se
> E-mail: daniel at arrayadvice.se
> Phone: +46 8 5537 8337 (Office)
> Phone: +46 704 71 65 91 (Mobile)
> Web: http://www.biotech.kth.se/genetech/index.html
> Web: http://www.arrayadvice.se/
> Fax: +46 8 5537 8481
> MSN messenger: klevebring at msn.com
>
--
Nicolas Servant
Equipe Bioinformatique
Institut Curie
26, rue d'Ulm - 75248 Paris Cedex 05 - FRANCE
Email: Nicolas.Servant at curie.fr
Tel: 01 56 24 69 85
http://bioinfo.curie.fr/