[Bioc-sig-seq] Illumina CIF format for ShortRead?
Martin Morgan
mtmorgan at fhcrc.org
Thu Sep 10 20:40:49 CEST 2009
Michael Muratet wrote:
> Greetings
>
> I would like to be able to use the ShortRead package on CIF file data
> from the latest version of the Illumina SCS/Pipeline tools. It appears
> that the current version of ShortRead (1.2.1?) doesn't handle this
> format. I have a snippet of an R script that will read these binary files
> and produce the same output as the Illumina cifToTxt tool and I'm
> willing to do the work to incorporate it into ShortRead. Are there
> already plans to do this? Can anyone point me to a document that describes
> the basic syntax and data structures behind R objects of this class?
> I've looked at the ShortRead source and I'm not sure I could figure it
> out just from that.
Hi Michael --
No, ShortRead does not parse CIF format. Is there a specification
somewhere? If you'd like to contribute the relevant parser, that would
be great! You'll probably want to use the development version of R and
of ShortRead (currently 1.3.33).
You're aiming for an object of class AlignedRead, which you would
construct from the bits you parse with a call to
AlignedRead(<your stuff here>)
There is some 'essential' information, like the reads and their quality
scores, and the chromosome and position of alignment; other stuff gets put
in an 'AlignedDataFrame'.
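As a rough illustration of such a call, here is a minimal sketch. It is not
checked against 1.3.33, and the argument names follow my reading of
?AlignedRead, so treat them as assumptions. The reads, ids, qualities,
positions and the 'lane' column are all made-up values; a real parser would
fill them in from the file.

```r
library(ShortRead)

## Hypothetical parsed values, standing in for what a parser would produce
reads <- DNAStringSet(c("ACGTACGT", "TTGACCAA"))
ids   <- BStringSet(c("read1", "read2"))
quals <- FastqQuality(c("IIIIIIII", "IIIIHHHH"))

## Anything that is not 'essential' goes in an AlignedDataFrame;
## 'lane' is an invented column, used here only for illustration
extra <- AlignedDataFrame(
    data = data.frame(lane = c(1L, 1L)),
    metadata = data.frame(labelDescription = "flow cell lane",
                          row.names = "lane"))

aln <- AlignedRead(
    sread = reads, id = ids, quality = quals,
    chromosome = factor(c("chr1", "chr2")),
    position = c(100L, 250L),
    ## assumption: the package expects strand levels "+", "-", "*"
    strand = factor(c("+", "-"), levels = c("+", "-", "*")),
    alignQuality = NumericQuality(c(30, 25)),
    alignData = extra)

aln
```

If the validity check complains about strand levels or the metadata, the
class documentation is the place to look for what your version expects.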
See ?AlignedRead and ?"AlignedRead-class" for more. I'm happy to provide
additional guidance, too.
Martin
>
> Hopefully, the CIF format will be around for a while.
>
> Thanks
>
> Mike
>
> Michael Muratet, Ph.D.
> Senior Scientist
> HudsonAlpha Institute for Biotechnology
> mmuratet at hudsonalpha.org
> (256) 327-0473 (p)
> (256) 327-0966 (f)
>
> Room 4005
> 601 Genome Way
> Huntsville, Alabama 35806
>
>
>
>
>