[Bioc-sig-seq] readFastq() error
Martin Morgan
mtmorgan at fhcrc.org
Thu Mar 24 03:44:40 CET 2011
On 03/23/2011 05:49 PM, joseph wrote:
> Hi Martin
> here is what I got:
> x = readLines('~/myDir/reads.fq')
> rd = x[c(FALSE, TRUE, FALSE, FALSE)]
> qual = x[c(FALSE, FALSE, FALSE, TRUE)]
> > which(nchar(rd) != nchar(qual))
> [1] 16509910
> # that is all the reads in the file
> # When I tried to count the reads with the same number of characters, I
> also got all the reads
> > length(which(nchar(rd) == nchar(qual)))
> [1] 16509909
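Note that which() returns an index, not a count: only record 16509910, the last one in the file, has read and quality strings of different widths; the other 16509909 are fine. I suspect there is a missing end-of-line on the last line of the file.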
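Something along these lines might help confirm where the problem is. It is only a sketch: check_fastq_tail is a made-up helper name (not a ShortRead function), and it assumes a plain, uncompressed FASTQ file with exactly four lines per record, as in your head output.

```r
# Hypothetical helper (not part of ShortRead): inspect the end of a FASTQ file.
# Assumes an uncompressed file with exactly four lines per record.
check_fastq_tail <- function(path) {
    size <- file.info(path)$size
    stopifnot(!is.na(size), size > 0)

    # Read the final byte of the file to see whether it is a newline (0x0a).
    con <- file(path, open = "rb")
    on.exit(close(con))
    seek(con, where = size - 1, origin = "start")
    last_byte <- readBin(con, what = "raw", n = 1L)

    # Same split as earlier in the thread: line 2 of each record is the read,
    # line 4 is the quality string.
    x <- readLines(path, warn = FALSE)
    rd <- x[c(FALSE, TRUE, FALSE, FALSE)]
    qual <- x[c(FALSE, FALSE, FALSE, TRUE)]

    # Compare only records that have both a read and a quality line.
    n <- min(length(rd), length(qual))
    idx <- seq_len(n)

    list(
        ends_with_newline = identical(last_byte, as.raw(0x0a)),
        complete_records = length(x) %% 4L == 0L,
        mismatched = which(nchar(rd[idx]) != nchar(qual[idx]))
    )
}

# Toy example: the second record has a short quality string and the file
# lacks a final newline.
tmp <- tempfile(fileext = ".fq")
cat("@r1\nACGT\n+r1\nIIII\n@r2\nACGT\n+r2\nIII", file = tmp)
res <- check_fastq_tail(tmp)
str(res)
```

If only the last record shows up in mismatched, the file was probably truncated while being written or copied, and it may be simplest to regenerate it or drop the final record before calling readFastq().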
>
> Joseph
>
>
>
> ------------------------------------------------------------------------
> *From:* Martin Morgan <mtmorgan at fhcrc.org>
> *To:* joseph <jdsandjd at yahoo.com>
> *Cc:* bioc-sig-sequencing at r-project.org
> *Sent:* Wed, March 23, 2011 4:21:25 PM
> *Subject:* Re: [Bioc-sig-seq] readFastq() error
>
> On 03/23/2011 04:07 PM, Martin Morgan wrote:
> > On 03/23/2011 03:58 PM, joseph wrote:
> >> Hello
> >> How would you fix a FASTQ file that gives the following error when
> >> read with readFastq()?
> >> Other lanes from the same flow cell are imported fine with readFastq().
> >>
> >> rfq = readFastq("~/myDir", pattern="reads.fq")
> >> Error: Input/Output
> >> file(s):
> >> ~/myDir/reads.fq
> >> message: IncompatibleTypes
> >> message: invalid class "ShortReadQ" object: some sread and quality widths differ
> >>
> >
> > you could read the file in
> >
> > x = readLines('~/myDir/reads.fq')
> >
> > split it into reads and qualities
> >
> > rd = x[c(FALSE, TRUE, FALSE, FALSE)]
> > qual = x[c(FALSE, FALSE, TRUE, FALSE)]
>
> oops, x[c(FALSE, FALSE, FALSE, TRUE)]
>
> >
> > and ask which have different numbers of characters
> >
> > which(nchar(rd) != nchar(qual))
> >
> > Martin
> >
> >> head reads.fq
> >> @GAII_0001:6:1:0:101#0/1
> >> NCTCANCATTGTTTGGACGGAACAAAACCGGGGACAATCT
> >> +GAII_0001:6:1:0:101#0/1
> >> BX[_\B_VXGQQU]]]YTPMGWTZZTVQ_X[TGYPZG[WZ
> >> @GAII_0001:6:1:0:123#0/1
> >> NGTGANTCNGCTCATTGCGAGTTTTAACCTTTTCTCTATC
> >> +GAII_0001:6:1:0:123#0/1
> >> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> >> @GAII_0001:6:1:0:168#0/1
> >> NCCAGNCCCAGCAGCCCTTCCTTTTCCCTGCTTACCCTCA
> >>
> >>
> >>
> >
> >
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
>