[Bioc-sig-seq] ShortRead support for "id", "paired read number" and "multiplex index" when reading an Illumina export file

Martin Morgan mtmorgan at fhcrc.org
Wed Feb 24 15:19:48 CET 2010


Hi Nicolas --

These sound like very useful additions, and I'll try to incorporate
them over the next day or so.

Thank you very much for the contribution!

Martin

On 02/24/2010 02:55 AM, Nicolas Delhomme wrote:
> Hi Martin, everyone,
> 
> I've been looking forward to doing this for a long time now, and,
> finally, I got the time. So, I dove into the ShortRead C code to add
> some functionality when loading Illumina export files. I've added an
> option to the readAligned method, specifically for the type
> "SolexaExport", that will, in addition to the default information,
> retrieve the multiplex barcode and the paired read number (the 6th and 7th
> columns of the export file, which were ignored so far). Additionally,
> using this option will create the sequence identifier (i.e. the one you
> get in a fastq file extracted from an export file) and populate the id
> slot of the AlignedRead object.
> 
> I've attached the diff of my local working copy against revision 44842
> of ShortRead (the current one, as of this morning), two example export
> files (one from a single-end (SE) and one from a paired-end (PE)
> sequencing experiment) and a small R script showing the modified usage.
> 
> I think that these functionalities are very interesting for people, like
> me, who have to analyze PE, multiplexed data, and I'd be glad if they
> got integrated.
> 
> Finally, I'm far from being a C expert, so you might wish (or need) to
> optimize what I've written.
> 
> Best,
> 
> ---------------------------------------------------------------
> Nicolas Delhomme
> 
> High Throughput Functional Genomics Center
> 
> European Molecular Biology Laboratory
> 
> Tel: +49 6221 387 8426
> Email: nicolas.delhomme at embl.de
> Meyerhofstrasse 1 - Postfach 10.2209
> 69102 Heidelberg, Germany
> ---------------------------------------------------------------
> 
> 
> 
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
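
For readers following along, the identifier construction described in the quoted message can be sketched roughly as below. This is only an illustration in Python, not the ShortRead C code from the attached diff. It assumes a tab-separated export line whose first eight fields are machine, run, lane, tile, x, y, multiplex index and paired read number (so the index and read number sit at 0-based positions 6 and 7), and it assumes the common `machine_run:lane:tile:x:y#index/read` identifier layout. The function name `export_read_id` and the example line are made up for this sketch.

```python
def export_read_id(line):
    """Build a FASTQ-style read identifier from one export-file line.

    Assumed field order (hypothetical, see the lead-in):
    machine, run, lane, tile, x, y, multiplex index, paired read number.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated fields")

    machine, run, lane, tile, x, y, index, read_number = fields[:8]

    # Core identifier: instrument/run plus the cluster coordinates.
    read_id = f"{machine}_{run}:{lane}:{tile}:{x}:{y}"

    # Append the multiplex index and read number only when present,
    # since single-end or non-multiplexed runs may leave them empty.
    if index:
        read_id += f"#{index}"
    if read_number:
        read_id += f"/{read_number}"
    return read_id


# Made-up example line (not taken from the attached example files).
example = "\t".join(
    ["HWI-EAS000", "1", "2", "3", "456", "789", "ACGTAC", "1", "ACGT", "IIII"]
)
print(export_read_id(example))  # HWI-EAS000_1:2:3:456:789#ACGTAC/1
```

Leaving the `#index` and `/read` suffixes off when those fields are empty is a choice made for this sketch; the actual patch may handle that case differently.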


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


