[BioC] XStringViews with strand?
Hervé Pagès
hpages at fhcrc.org
Thu Jun 4 20:14:51 CEST 2009
Hi Cei,
Cei Abreu-Goodger wrote:
> Hi all,
>
> I'm trying to read in a fasta sequence, extract the "gene sequences" and
> write these out to a fasta file. I can read the sequences with
> read.DNAStringSet(), obtain an XStringViews object with Views(), but I'm
> having trouble knowing how to obtain the reverse complement sequence for
> the genes on the "-" strand. I can get them with a reverseComplement()
> of the XStringViews object, but I can't overwrite the elements of this
> object. So my solution involves dealing separately with all the genes on
> the "+" strand and those on the "-" strand. Is there an easier way?
>
> An example:
>
> file <- system.file("extdata", "someORF.fa", package="Biostrings")
> x <- read.DNAStringSet(file, "fasta")[[1]]
>
> names <- c("a","b","c")
> starts <- c(10,384,947)
> ends <- starts+20
> strands <- c("+","-","+")
>
> myViews <- Views(x, start=starts, end=ends)
> names(myViews) <- names
>
> revViews <- reverseComplement(myViews[strands=="-"])
> posViews <- myViews[strands=="+"]
Yes, it has to be done separately. The content of a view cannot be modified,
because that would mean modifying the underlying subject, and you would end
up with a subject that is a mix of + and - strand. And what should be done
when two views overlap, one associated with a gene on the + strand and
the other with a gene on the - strand?
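For what it's worth, the "do it separately, then put the pieces back in gene order" approach can be sketched as below. This is only a sketch under assumptions: it is written against a recent Biostrings release, where read.DNAStringSet() from the example above is spelled readDNAStringSet(), and the names isMinus and genes are made up for illustration. It goes through plain character vectors so that no replacement method on views or string sets is needed.

```r
library(Biostrings)

# Same example data as in the question.
file <- system.file("extdata", "someORF.fa", package = "Biostrings")
x <- readDNAStringSet(file)[[1]]   # read.DNAStringSet(file, "fasta") in old releases

starts  <- c(10, 384, 947)
ends    <- starts + 20
strands <- c("+", "-", "+")

myViews <- Views(x, start = starts, end = ends)
names(myViews) <- c("a", "b", "c")

# Handle the two strands separately; the views themselves are never modified.
isMinus   <- strands == "-"
plusSeqs  <- as.character(myViews[!isMinus])
minusSeqs <- as.character(reverseComplement(myViews[isMinus]))

# Reassemble in the original gene order in an ordinary character vector,
# then turn the result into a DNAStringSet.
genes <- character(length(myViews))
genes[!isMinus] <- plusSeqs
genes[isMinus]  <- minusSeqs
genes <- DNAStringSet(genes)
names(genes) <- names(myViews)
```

The resulting set should then be writable as fasta with writeXStringSet(genes, "genes.fa") (write.XStringSet() in older releases; the file name is just a placeholder).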
In the near future we will support replacement of the elements of a DNAStringSet
object through "[<-" and "[[<-". Then it will be possible to reverseComplement
some of its elements with something like this (x being a DNAStringSet with one
element per gene, and strands a character vector of the same length):
  x[strands == "-"] <- reverseComplement(x[strands == "-"])
But that will be for XStringSet only, not for XStringViews.
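Applied to the example from the question, that would mean first copying the views into a DNAStringSet and then replacing the "-" strand elements in place. A rough sketch, assuming a Biostrings release in which "[<-" on DNAStringSet is available (it is not in 2.12.1) and in which read.DNAStringSet() is spelled readDNAStringSet(); the names isMinus and genes are made up for illustration:

```r
library(Biostrings)

file <- system.file("extdata", "someORF.fa", package = "Biostrings")
x <- readDNAStringSet(file)[[1]]

strands <- c("+", "-", "+")
myViews <- Views(x, start = c(10, 384, 947), width = 21)
names(myViews) <- c("a", "b", "c")

# Copy the viewed regions into a DNAStringSet; the subject x is left untouched.
genes <- DNAStringSet(myViews)

# Replace only the "-" strand elements with their reverse complement.
isMinus <- strands == "-"
genes[isMinus] <- reverseComplement(genes[isMinus])
```

Because the replacement acts on the copy and not on the views, overlapping views on opposite strands are not a problem here.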
These developments will take place in the devel version of Biostrings in the
next few weeks.
Cheers,
H.
>
>
> Many thanks,
>
> Cei
>
> sessionInfo()
> R version 2.9.0 (2009-04-17)
> i386-apple-darwin8.11.1
>
> locale:
> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] tools stats graphics grDevices datasets utils methods
> [8] base
>
> other attached packages:
> [1] Biostrings_2.12.1 IRanges_1.2.2 Biobase_2.4.1
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319