[R] about subsetting vectors/list in R

Fri Apr 5 04:33:28 CEST 2013

Hi,
You could also use:
library(seqinr)
x1<- rep(c('A','G','C','T'),4)

splitseq(x1,frame=0,word=2)
#[1] "AG" "CT" "AG" "CT" "AG" "CT" "AG" "CT"
splitseq(x1,frame=1,word=2)
#[1] "GC" "TA" "GC" "TA" "GC" "TA" "GC"
splitseq(x1,frame=0,word=3)
#[1] "AGC" "TAG" "CTA" "GCT" "AGC"
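
In case it helps to see what frame and word are doing, here is a rough base-R sketch of the same idea. split_words is a made-up helper name, not part of seqinr, and it only assumes what the output above shows: frame elements are skipped at the start, and an incomplete trailing word is dropped.

```r
# Hypothetical helper (not from seqinr): cut a character vector into words
# of `word` elements each, after skipping the first `frame` elements.
split_words <- function(x, frame = 0, word = 2) {
  x <- x[seq_along(x) > frame]            # drop the leading `frame` elements
  n <- (length(x) %/% word) * word        # length of the part that forms full words
  if (n == 0) return(character(0))
  m <- matrix(x[seq_len(n)], nrow = word) # column-wise fill: one word per column
  rows <- lapply(seq_len(word), function(i) m[i, ])
  do.call(paste0, rows)                   # paste the rows together element-wise
}

x1 <- rep(c('A','G','C','T'), 4)
split_words(x1, frame = 0, word = 2)
split_words(x1, frame = 1, word = 2)
split_words(x1, frame = 0, word = 3)
```

For x1 these three calls should reproduce the three splitseq() results shown above.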

A.K.

----- Original Message -----
From: R. Michael Weylandt <michael.weylandt at gmail.com>
To: Abhishek Pratap <abhishek.vit at gmail.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Thursday, April 4, 2013 9:14 PM
Subject: Re: [R] about subsetting vectors/list in R

On Thu, Apr 4, 2013 at 7:55 PM, Abhishek Pratap <abhishek.vit at gmail.com> wrote:
> On Thu, Apr 4, 2013 at 5:53 PM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>> by_two <- function(x, collapse = ""){
>>    dim(x) <- c(length(x) / 2, 2)
>>    apply(x, 1, function(y) paste(y, collapse = collapse))
>> }
>
> Thanks. Just wondering whether this will be fast enough for lists/vectors
> with hundreds of thousands of entries?

No, the apply() loop likely isn't optimal. But I can do

x <- rep(letters, length.out = 1e6)
system.time(by_two(x)) # Approx 15 seconds

on my slow old machine, so this might be one of those cases of "good
enough; worry about it later if profiling shows it's a real
bottleneck".
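
If it does turn out to matter, a vectorized version is one way to avoid the apply() loop. The sketch below is only an illustration: by_two_vec is a made-up name, and it assumes adjacent elements should be joined (x[1] with x[2], x[3] with x[4], and so on). Note that dim(x) <- c(length(x) / 2, 2) as quoted fills the matrix column by column, so row i of by_two() pairs x[i] with x[i + length(x)/2] instead; the sketch sidesteps that by indexing odd and even positions directly.

```r
# Sketch of a vectorized alternative (by_two_vec is a hypothetical name).
# Assumes adjacent elements are to be joined: x[1] with x[2], x[3] with x[4], ...
by_two_vec <- function(x, collapse = "") {
  stopifnot(length(x) %% 2 == 0)   # an odd-length input has no complete last pair
  odd  <- x[c(TRUE, FALSE)]        # elements 1, 3, 5, ... (logical index is recycled)
  even <- x[c(FALSE, TRUE)]        # elements 2, 4, 6, ...
  paste(odd, even, sep = collapse) # one vectorized call, no per-row function calls
}

x <- rep(letters, length.out = 1e6)
system.time(res <- by_two_vec(x))  # timing depends on the machine
head(res)
```

Since paste() is called once on two long vectors here, it would be expected to run well under the apply() timing, though that is worth checking with system.time() on the real data.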
