[BioC] how to operate on a DNAStringSet object

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Mar 21 22:07:59 CET 2013


On closer inspection, the endoapply suggestion is itself a for-loop
under the covers, so it won't buy you any speed over an explicit loop.
My guess is that the lapply version will be faster.
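
As a rough illustration of the per-sequence shuffle discussed in this
thread, here is a minimal sketch that goes through a plain character
vector rather than endoapply or lapply. It is one alternative, not the
approach from the messages below, and it has not been benchmarked.
`shuffle_one` is a hypothetical helper name, and the named input is
made up to show that names survive the round trip.

```r
library(Biostrings)

# Hypothetical helper: shuffle the letters of a single string.
shuffle_one <- function(s) {
  letters_vec <- strsplit(s, "", fixed = TRUE)[[1]]
  paste(sample(letters_vec), collapse = "")
}

# Made-up example input; the names are there to exercise the named case.
xx <- DNAStringSet(c(seqA = "GATACA", seqB = "GATCCTAA"))

# as.character() gives a named character vector; vapply() keeps the
# names, and DNAStringSet() rebuilds the set from that vector.
shuffled <- DNAStringSet(vapply(as.character(xx), shuffle_one, character(1)))
shuffled
```

Because the intermediate is an ordinary named character vector, the
result is a DNAStringSet again (not a list), and the named-sequence
corner case never comes into play. Whether this is faster than a
for-loop on a large set is an open question here.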

-steve

On Thu, Mar 21, 2013 at 5:05 PM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> On Thu, Mar 21, 2013 at 4:48 PM, Chris Seidel <seidel at phaget4.org> wrote:
> [aggressive clipping]
>
>> What's odd, is that this actually works:
>>
>> DNAStringSet(do.call(c,unlist(myRandomizedseqs)))
>>
>> *IF* the sequences are NOT NAMED.
>
> This (or something similar) has come up before on the ML, but I don't
> have time to search for it right now. I posted a suggestion there to
> use "unname" defensively to sidestep these corner cases. Perhaps that
> will help you find the thread when searching the archives. In any
> event, you could do:
>
> R> DNAStringSet(do.call(c, unname(unlist(...))))
>
> Now that I look at your example, I think the thread I'm talking about
> might have been slightly different, but I guess this should still work
> in your case.
>
>> How does one operate on the sequences of a DNAStringSet object without
>> getting a list back, or without a for loop? I'm sure there's some
>> elegant one-liner that completely escapes me.
>
> To randomize the sequences, you could do:
>
> R> xx <- DNAStringSet(c("GATACA", "GATCCTAA"))
> R> endoapply(xx, sample)
>   A DNAStringSet instance of length 2
>     width seq
> [1]     6 ACGATA
> [2]     8 GTCATAAC
>
> Where did that come from, right?
>
> Note that a DNAStringSet is an IRanges::Vector, and you'll find lots
> of things in the IRangesOverview vignette, which at first might seem
> too long/detailed to read, but will be worth your time.
>
> Not sure how fast this will be on a large XStringSet object, though.
> You may not buy yourself more speed than the for loop, but I can't test
> that right now. Perhaps lapply(xx, sample) might be faster, but
> I'll leave this as an exercise for the reader.
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Defender of The Thesis
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact





