[BioC] [devteam-bioc] Howto achieve reverse complement in a ShortReadQ object
Martin Morgan
mtmorgan at fhcrc.org
Fri May 2 16:28:34 CEST 2014
On 05/02/2014 06:35 AM, Maintainer wrote:
>
> I have a set of data coming from a PacBio RSII circular consensus sequencing
> (CSS). In this dataset, the reads have a random orientation. For my further
> analysis, I require them to be reoriented to have the same orientation, but
> retain the phred quality information and ID, i.e., I need the final data to
> be in a ShortReadQ object.
>
> I cannot find a function that achieves a reverse complement on the sequence
> and the reverse on the quality. Is there one that I'm missing?
>
> for the reverse complement on the sequence I can conduct the following
> command: myReads at sread <- reverseComplement(myReads at sread)
>
> where "myReads" is the ShortReadQ object.
>
> there is also a reverse command, but this is not available for the
> FastqQuality, i.e.,
>
> myReads at quality <- reverse(myReads at quality) does not work!
Hi Thomas --
For a reproducible example I ran
library(ShortRead)
example(readFastq)
and then I have
> rfq
class: ShortReadQ
length: 256 reads; width: 36 cycles
You're correct that reverse, etc., aren't available for ShortReadQ objects (I'll
add them), but only for the underling DNAStringSet and BStringSet objects. I
think you can achieve what you want with
updt = ShortReadQ(reverseComplement(sread(rfq)),
FastqQuality(reverse(quality(quality(rfq)))),
id(rfq))
> head(sread(updt), 3)
A DNAStringSet instance of length 3
width seq
[1] 36 ACAGGAGAAGGAAAGCGAGGGTATCCTACAAAGTCC
[2] 36 GTCCGATGCTGTTCAACCACTAATAGGTAAGAAATC
[3] 36 ACCCAAATTGATATTAATAACACTATAGACCACCGC
> head(quality(updt), 3)
class: FastqQuality
quality:
A BStringSet instance of length 3
width seq
[1] 36 SALPMVHCV]]]]]]]]]]]]Y]Y]]]]]]]]]]]]
[2] 36 KCCIPZP[]T]VPY]]]]]]]]]Y]]]]]]]]]]]]
[3] 36 ALFEXUEJMT]V]]]]]T]]]]]T]V]]]]]Y]]]]
and to update only some, with index 'idx' a logical or integer vector
tmp = rfq[idx]
rfq[idx] = ShortReadQ(reverseComplement(sread(tmp)),
FastqQuality(reverse(quality(quality(tmp)))),
id(tmp))
Nowadays it is almost NEVER necessary to access a slot directly with @; think of
these as private.
Hope that helps!
Martin
>
> I have tried to generate a for loop but run into the issue that the
> FastqQuality object is not subsettable
>
> I highly appreciate if someone could help me find a solution for this
>
> All the best
>
> Tomas
>
>
>
>
> -- output of sessionInfo():
>
> R version 3.1.0 (2014-04-10) Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages: [1] parallel stats graphics grDevices utils
> datasets methods base
>
> other attached packages: [1] ShortRead_1.22.0 GenomicAlignments_1.0.0
> BSgenome_1.32.0 Rsamtools_1.16.0 GenomicRanges_1.16.2 [6]
> GenomeInfoDb_1.0.2 Biostrings_2.32.0 XVector_0.4.0
> IRanges_1.22.3 BiocParallel_0.6.0 [11] BiocGenerics_0.10.0
>
> loaded via a namespace (and not attached): [1] BatchJobs_1.2 BBmisc_1.6
> Biobase_2.24.0 bitops_1.0-6 brew_1.0-6 codetools_0.2-8
> [7] DBI_0.2-7 digest_0.6.4 fail_1.2 foreach_1.4.2
> grid_3.1.0 hwriter_1.3 [13] iterators_1.0.7 lattice_0.20-29
> latticeExtra_0.6-26 plyr_1.8.1 RColorBrewer_1.0-5 Rcpp_0.11.1 [19]
> RSQLite_0.11.4 sendmailR_1.1-2 stats4_3.1.0 stringr_0.6.2
> tools_3.1.0 zlibbioc_1.10.0
>
> -- Sent via the guest posting facility at bioconductor.org.
>
> ________________________________________________________________________
> devteam-bioc mailing list To unsubscribe from this mailing list send a blank
> email to devteam-bioc-leave at lists.fhcrc.org You can also unsubscribe or
> change your personal options at
> https://lists.fhcrc.org/mailman/listinfo/devteam-bioc
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
More information about the Bioconductor
mailing list