[Bioc-sig-seq] readAligned in ShortRead package

Martin Morgan mtmorgan at fhcrc.org
Tue Aug 4 17:44:33 CEST 2009


Martin Morgan wrote:
> Ingunn Berget wrote:
>> Dear All
>>
>> According to the documentation for "readAligned" in package "ShortRead" (version 1.2.1) the match contig column is ignored, is there any easy way of getting this information into R?
> 
> as a work-around, these are text files and you might try
> 
> ## read
> aln <- readAligned(path_to_file, type="SolexaExport")
> what <- rep(list(NULL), 22)
> what[[8]] <- "character"
> contig <- scan(path_to_file, what=what, sep="\t", fill=TRUE)[[8]]

Oops! Quote characters in the quality strings will confuse scan()'s
parsing, and the contig is in column 12, not column 8. So this should be

what[[12]] <- character()
contig <- scan(path_to_file, what=what, sep="\t",
                fill=TRUE, quote="")[[12]]
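
To see the corrected call work without a real export file, here is a minimal sketch on made-up data. The helper make_record(), the field contents, and the placement of the quality string in column 10 are assumptions for illustration only; the scan() arguments are the ones given above, plus quiet=TRUE to suppress the item count. The second half applies the same arguments (column 12, quote="") to a gz-compressed copy read through gzfile().

```r
## Toy stand-in for an export file: 22 tab-separated fields per record.
## Field contents are invented; the quality string is placed in column 10
## and the contig in column 12 as an assumption about the layout.
make_record <- function(quality, contig) {
    fields <- rep("x", 22)
    fields[10] <- quality
    fields[12] <- contig
    paste(fields, collapse="\t")
}
records <- c(make_record("ab\"cd", "contigA"),
             make_record("ef'gh", "contigB"))

path_to_file <- tempfile(fileext=".txt")
writeLines(records, path_to_file)

## NULL entries tell scan() to skip a field; only column 12 is kept
what <- rep(list(NULL), 22)
what[[12]] <- character()
contig <- scan(path_to_file, what=what, sep="\t",
               fill=TRUE, quote="", quiet=TRUE)[[12]]

## same call on a gz-compressed copy, reading through gzfile()
path_to_gz <- tempfile(fileext=".txt.gz")
con <- gzfile(path_to_gz, "w")
writeLines(records, con)
close(con)
contig_gz <- scan(gzfile(path_to_gz), what=what, sep="\t",
                  fill=TRUE, quote="", quiet=TRUE)[[12]]
```

With quote="" the quote characters embedded in the toy quality strings are read as ordinary text, so each record still splits into 22 fields and contig has one entry per record.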

> ## check contig for correct values
> 
> ## add to alignData
> adata <- alignData(aln)
> adata[["contig", labelDescription="Solexa export 'contig' data"]] <-
>   contig
> 
> ## update AlignedRead
> aln <- initialize(aln, alignData=adata)
> 
> If the files are gz-compressed, then I think you'll want to
> 
> contig <- scan(gzfile(path_to_file), what=what, sep="\t",
>                fill=TRUE)[[8]]
> 
> I will update ShortRead to parse this data into alignData.
> 
> Martin
> 
>>
>> Best regards 
>> Ingunn
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


