[Bioc-sig-seq] reading 454 data with ShortRead

Dan Bolser dan.bolser at gmail.com
Wed Aug 19 19:05:53 CEST 2009


OK, I got what I wanted with the following:


library(ShortRead)

experimentPath <-
  "./Fasta/"

readPath <- experimentPath
qualPath <- experimentPath

my.rp <-
  RochePath(experimentPath=experimentPath,
            readPath=readPath,
            qualPath=qualPath)


pattern <- "FX0QVYF02"

x <-
  readFasta(my.rp, pattern)
y <-
  readQual(my.rp, pattern)

show(x)

mean(width(x))


Is there an equivalent 'qa' function on the RochePath?


As an aside, what tool would you recommend to align 454 reads to a
reference genome?


Cheers,
Dan.



2009/8/19 Dan Bolser <dan.bolser at gmail.com>:
> I'm trying to use this function (read454) but it keeps dying..
>
>
> library(ShortRead)
>
> experimentPath <-
>  "Fasta/"
>
> readPath <- experimentPath
> qualPath <- experimentPath
>
> pattern <- "FUBJ29X06.sff."
>
> reads <-
>  read454(RochePath(experimentPath=experimentPath,
>                    readPath=readPath,
>                    qualPath=qualPath), pattern)
>
> +   read454(RochePath(experimentPath=experimentPath,
> +                     readPath=readPath,
> +                     qualPath=qualPath), pattern)
> Error in file(file, "r") : cannot open the connection
> In addition: Warning message:
> In file(file, "r") : cannot open file 'NA': No such file or directory
>
>
> The contents of the 'Fasta' directory look like this:
>
> ls Fasta/FUBJ29X06.sff.*
> Fasta/FUBJ29X06.sff.fna  Fasta/FUBJ29X06.sff.qual
>
>
>
> Why can't I just specify a fasta file and a quality file?
>
> Cheers,
> Dan.
>
>
> 2009/7/17 Pratap, Abhishek <APratap at som.umaryland.edu>:
>> Hi
>>
>> In that case based on my short experience I think you could specify a pattern while reading files from a directory.
>>
>> pattern-<"regex-pattern-here"
>>
>> reads <- read454(RochePath(experimentPath=experimentPath,readPath=readPath, qualPath=qualPath), pattern)
>>
>> I think should work.
>>
>> Thanks
>> -A
>>
>>
>> -----Original Message-----
>> From: bioc-sig-sequencing-bounces at r-project.org on behalf of Wolfgang Raffelsberger
>> Sent: Fri 7/17/2009 1:35 PM
>> Cc: bioc-sig-sequencing at r-project.org
>> Subject: Re: [Bioc-sig-seq] reading 454 data with ShortRead
>>
>> Dear Michael,
>>
>> Thank's a lot, it works now ...
>> If I understand right, I may not have more than one single pair of
>> sequence-reads (.fna) and quality-scores (.qual) in a given directory,
>> since you don't specify the file-names.
>>
>> Wolfgang
>>
>> Michael Mueller a écrit :
>>> Dear Wolfgang,
>>>
>>> readPath and qualPath specify the paths to the directory containing the
>>> .fna and .qual files rather than the paths to the files themselves.
>>>
>>> Try:
>>>
>>> experimentPath <- "/path/to/directory/containing/fna/and/qual/files"
>>> readPath <- experimentPath
>>> qualPath <- experimentPath
>>>
>>> reads <- read454(RochePath(experimentPath=experimentPath,
>>> readPath=readPath, qualPath=qualPath))
>>>
>>> If only experimentPath is specified all sub-directories of
>>> <experimentPath> matching run are searched for read/quality files.
>>>
>>> Regards,
>>> Michael
>>>
>>> ----------------------------
>>> Michael Mueller
>>> Bioinformatician
>>>
>>> Imperial College London
>>> MRC Clinical Sciences Centre
>>> Du Cane Road
>>> London, W12 0NN
>>> United Kingdom
>>>
>>> phone +44 (0)20 8383 8537
>>> fax   +44 (0)20 8383 8577
>>>
>>>
>>> Wolfgang Raffelsberger wrote:
>>>> Dear list,
>>>> I'm trying to read some 454 data into ShortRead, but I'm having some
>>>> difficulties. Basically I have a FASTA concatenated file with all the
>>>> sequence reads and a FASTA-like file with the Phred-quailty scores,
>>>> the identifyiers for each read from both files match. Here I'm
>>>> testing a small sub-set of the reads, so memory etc shouldn't be an
>>>> issue.
>>>> Based on the ShortRead Vignette I thought could do the job and then I
>>>> realized that there are 454-dedicated functions. But so far I didn't
>>>> get either of them of them to work.
>>>>
>>>>  > library(ShortRead)
>>>>  > Sfile1 <- "01Reads.fna"
>>>>  > Qfile1 <- "01Reads.qual"
>>>>  > tmp <- read454(".", readPath=Sfile1, qualPath=Qfile1)    # I'm
>>>> already in the proper path where both files are located ...
>>>> Error in function (classes, fdef, mtable)  :
>>>>   unable to find an inherited method for function "read454", for
>>>> signature "character"
>>>>  >
>>>>  > # the I tried ...
>>>>  >  tmp <- readXStringColumns(".",pattern=Sfile1,
>>>> colClasses=list("DNAString"))
>>>> Error: Input/Output
>>>>   while reading files 'test16seq.fna':
>>>>     key 62 not in lookup table
>>>>  >
>>>>  > # for completeness :
>>>>  > sessionInfo()
>>>> R version 2.9.1 (2009-06-26)
>>>> i386-pc-mingw32
>>>>
>>>> locale:
>>>> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
>>>>
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>> other attached packages:
>>>> [1] ShortRead_1.2.1   lattice_0.17-25   BSgenome_1.12.3
>>>> Biostrings_2.12.7 IRanges_1.2.3
>>>> loaded via a namespace (and not attached):
>>>> [1] Biobase_2.4.1 grid_2.9.1    hwriter_1.1   tools_2.9.1
>>>>
>>>>
>>>> Thanks in advance,
>>>> Wolfgang
>>>>
>>>>
>>>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>>>> Wolfgang Raffelsberger, PhD
>>>> Laboratoire de BioInformatique et Génomique Intégratives
>>>> CNRS UMR7104, IGBMC,  1 rue Laurent Fries,  67404 Illkirch
>>>> Strasbourg,  France
>>>> Tel (+33) 388 65 3300         Fax (+33) 388 65 3276
>>>> wolfgang.raffelsberger (at) igbmc.fr
>>>>
>>>> _______________________________________________
>>>> Bioc-sig-sequencing mailing list
>>>> Bioc-sig-sequencing at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>>
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>>        [[alternative HTML version deleted]]
>>
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>



More information about the Bioc-sig-sequencing mailing list