[BioC] replace nucleotide at fixed position in a DNAStringSet object

Hervé Pagès hpages at fhcrc.org
Mon Sep 16 20:40:20 CEST 2013


Hi guys,

With Bioc-devel, you can use replaceAt() for this:

   x <- DNAStringSet(c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG"))
   y <- DNAStringSetList(DNAStringSet("G"), DNAStringSet("C"),
                         DNAStringSet("C"))

Then:

   > replaceAt(x, IRanges(4, 4), y)
     A DNAStringSet instance of length 3
       width seq
   [1]     9 ATGGCCACG
   [2]     9 ACTCGGGAA
   [3]     9 GCCCATGCG
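
As a sanity check, the replaceAt() result can be compared against the
character round trip that the original question wanted to avoid. This is
only a minimal sketch: the variable names via_replaceAt and via_substr are
made up for illustration, and it assumes a Biostrings version that provides
replaceAt() (Bioc-devel at the time of writing).

```r
library(Biostrings)

x <- DNAStringSet(c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG"))
y <- DNAStringSetList(DNAStringSet("G"), DNAStringSet("C"),
                      DNAStringSet("C"))

# XStringSet route: replace position 4 of each sequence with the
# corresponding element of 'y'
via_replaceAt <- as.character(replaceAt(x, IRanges(4, 4), y))

# Character route: convert first, then use base R's vectorized substr<-
via_substr <- as.character(x)
substr(via_substr, 4, 4) <- c("G", "C", "C")

# Both routes should give the same sequences
identical(via_replaceAt, via_substr)
```

The character route is fine for a toy example like this one; replaceAt()
is there so that you don't have to leave the DNAStringSet representation
when 'x' is large.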

An important clarification: an XString or XStringSet object is no more
immutable than a character vector, or any other R object. As with R
objects in general, we are not supposed to modify it *in-place*, except
in particular situations where we know it is safe to do so. When it is
not safe, the object (or part of it) is copied and the copy is
modified. All of this is transparent to the end user, who should never
need to worry about whether it is safe to call [<-, [[<- or replaceAt()
on a DNAStringSet object: copies are made if needed, so those
operations are always safe.
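
To illustrate what "always safe" means in practice, here is a small
sketch (x_copy is just an illustrative name). Modifying one binding with
[<- should leave the other binding untouched, exactly as it would for a
character vector:

```r
library(Biostrings)

x <- DNAStringSet(c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG"))

# A second binding to the same object
x_copy <- x

# Replace the 2nd sequence via [<- ; any copying that is needed
# happens behind the scenes
x_copy[2] <- DNAStringSet("TTTTTTTTT")

as.character(x)       # the original, which should be unchanged
as.character(x_copy)  # the modified copy
```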

Cheers,
H.



On 09/16/2013 09:46 AM, Valerie Obenchain wrote:
> Hi,
>
> On 09/13/2013 07:13 AM, Robert Castelo wrote:
>> hi!!
>>
>> i'd like to know if there is some efficient way to replace a nucleotide
>> at a fixed position in a DNAStringSet object.
>>
>> let's say we have the following toy DNAStringSet object with 3 DNA
>> sequences:
>>
>> x <- DNAStringSet(c("ATGACCACG", "ACTGGGGAA", "GCCGATGCG"))
>> x
>>    A DNAStringSet instance of length 3
>>      width seq
>> [1]     9 ATGACCACG
>> [2]     9 ACTGGGGAA
>> [3]     9 GCCGATGCG
>>
>> and a DNAStringSetList object with the following 3 nucleotides
>>
>> y <- DNAStringSetList(DNAStringSet("G"), DNAStringSet("C"),
>> DNAStringSet("C"))
>> y
>> DNAStringSetList of length 3
>> [[1]] G
>> [[2]] C
>> [[3]] C
>>
>> i'd like to replace the, let's say, fourth nucleotide along the DNA
>> sequences in 'x' by those in 'y'. i can imagine how to do it coercing
>> back and forth to character and so on but i guess there must be some
>> more efficient way to do it.
>
> I don't think so. XString objects are immutable. The data are accessed
> through an external pointer to an environment where they are
> written/stored as raw. To subset/replace positions in 'x' with values
> from 'y' you would need to go through the 'as.character' conversion and
> create a new DNAStringSet.
>
> I've cc'd Herve in case I've gotten this wrong or he has a different
> solution to the problem.
>
> Valerie
>
>> my interest comes from the fact that the
>> DNAStringSet object i have to work with can have many DNA sequences.
>>
>> thanks!!
>> robert.
>>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


