[R] Loop avoidance and logical subscripts

retama retama1745 at gmail.com
Thu May 21 18:56:23 CEST 2009


Patrick Burns kindly provided an article about this issue called 'The R
Inferno'. However, I will expand on my question a little, because I think it
was not clear and, if I can improve the code, it will be more understandable
to other users reading these messages when I paste it :)

In my example, I have a data frame with several hundred DNA sequences in
the column data$sequences (each value is a long string written in an
alphabet of four characters: A, C, T and G). I'm trying to compute the
proportion of Gs plus Cs over the total, (G+C)/(A+T+C+G), for each
sequence. For example, data$sequences[1] is something like AATTCCCGGGGGG,
but a little bit longer, and its G+C content is 0.69. I need to compute a
vector with all the G+C contents (in my example, data$GCsequence, in which
data$GCsequence[1] is 0.69).
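
To make the computation concrete, here is a minimal sketch of one way to get that vector without an explicit loop. The data frame below is a made-up stand-in for the real one, and the sketch assumes the sequences are uppercase and contain only A, C, G and T; it is not the code from my script.

```r
# Hypothetical stand-in for the real data frame (which holds several
# hundred much longer sequences).
data <- data.frame(
  sequences = c("AATTCCCGGGGGG", "ATATATAT", "GGCC"),
  stringsAsFactors = FALSE
)

# Delete every character that is not G or C, then compare string lengths.
# Both gsub() and nchar() work on the whole column at once.
gc_only <- gsub("[^GC]", "", data$sequences)
data$GCsequence <- nchar(gc_only) / nchar(data$sequences)

print(round(data$GCsequence, 2))
# 0.69 0.00 1.00
```

The first value, 9/13, is the 0.69 mentioned above. Lowercase letters or ambiguity codes such as N would need extra handling, for instance toupper() on the column first.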

So the question was whether it is OK to build the result in a loop by
combining values with c() or cbind(), or with logical subscripts, and which
approach should be more efficient (my script runs really slowly).
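
The approaches I have in mind can be sketched side by side as follows. The helper name gc_content and the sample sequences are hypothetical, chosen only for illustration. As far as I understand, growing a vector with c() inside a loop makes R copy the result on every iteration, so preallocating or using a vectorized call is usually recommended, but I have not timed these on my real data.

```r
# Hypothetical helper: fraction of G and C characters in each string.
gc_content <- function(s) nchar(gsub("[^GC]", "", s)) / nchar(s)

seqs <- c("AATTCCCGGGGGG", "ATATATAT", "GGCC")  # made-up sample data

# 1. Loop that grows the result with c(); the vector is rebuilt each time.
gc_grow <- c()
for (s in seqs) {
  gc_grow <- c(gc_grow, gc_content(s))
}

# 2. Loop that fills a preallocated vector by index.
gc_prealloc <- numeric(length(seqs))
for (i in seq_along(seqs)) {
  gc_prealloc[i] <- gc_content(seqs[i])
}

# 3. No explicit loop: the helper is already vectorized over its input.
gc_vec <- gc_content(seqs)

# All three approaches give the same values.
stopifnot(isTRUE(all.equal(gc_grow, gc_vec)),
          isTRUE(all.equal(gc_prealloc, gc_vec)))
```

On real data, the three versions could be compared by wrapping each one in system.time().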

Thank you,

Retama


-- 
View this message in context: http://www.nabble.com/Loop-avoidance-and-logical-subscripts-tp23652935p23656703.html
Sent from the R help mailing list archive at Nabble.com.



