[R] For help in R coding

David Winsemius dwinsemius at comcast.net
Mon Jul 4 03:02:49 CEST 2011


On Jul 3, 2011, at 6:10 PM, Bansal, Vikas wrote:

>> So I want to code so that it will give the output like this-
>>
>> DATA FRAME (Input)

Editing the task so it is reproducible:

dat <- read.table(textConnection(' col3                 col9
   T                      .a,g,,
   A                    .t,t,,
   A                    .,c,c,
   C                     .,a,,,
   G                     .,t,t,t
   A                     .c,,g,^!.
   A                      .g,ggg.^!,
   A                      .$,,,,,.,
   C                      a,g,,t,
   T                      ,,,,,.,^!.
   T                       ,$,,,,.,."'), header=TRUE, stringsAsFactors=FALSE)

>> output
>>
>> A            C                 G                        T
>> 1             0                  1                        4
>> 4             0                  0                        2
>> 4              2                 0                        0
>> 1              5                 0                        0
>> 0              0                 4                        3

It's also possible to apply the logic that Gabor Grothendieck offered  
at the beginning of this thread:

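# Replace every "," or "." in col9 with that row's reference base from col3
dat[, "newcol"] <- apply(dat, 1, function(x) gsub("\\,|\\.", x[1], x[2]))
# ... then count each base; A is shown here, with the obvious
# repetition for C, G, T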

 > dat[,"A"] <- nchar( gsub("[^aA]", "", dat[ , "newcol"] ))
 > dat
    col3       col9     newcol A
1     T     .a,g,,     TaTgTT 1
2     A     .t,t,,     AtAtAA 4
3     A     .,c,c,     AAcAcA 4
4     C     .,a,,,     CCaCCC 1
5     G    .,t,t,t    GGtGtGt 0
6     A  .c,,g,^!.  AcAAgA^!A 5
7     A .g,ggg.^!, AgAgggA^!A 4
8     A  .$,,,,,.,  A$AAAAAAA 8
9     C    a,g,,t,    aCgCCtC 1
10    T ,,,,,.,^!. TTTTTTT^!T 0
11    T ,$,,,,.,." T$TTTTTTT" 0
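
The "obvious repetition" for the other bases can be written as a loop. What follows is only a sketch under a few assumptions: it rebuilds the first five rows of dat by hand so that it runs on its own, it uses mapply() rather than apply() for the substitution step, and the names `base` and `pattern` are chosen for this illustration and do not come from the thread.

```r
# First five rows of the example data, rebuilt by hand (illustrative subset)
dat <- data.frame(col3 = c("T", "A", "A", "C", "G"),
                  col9 = c(".a,g,,", ".t,t,,", ".,c,c,", ".,a,,,", ".,t,t,t"),
                  stringsAsFactors = FALSE)

# Step 1: replace every "." or "," with the row's reference base from col3
dat$newcol <- mapply(function(ref, pile) gsub("[.,]", ref, pile),
                     dat$col3, dat$col9, USE.NAMES = FALSE)

# Step 2: for each base, delete every character that is not that base
# (upper or lower case) and count what is left
for (base in c("A", "C", "G", "T")) {
  pattern <- paste0("[^", tolower(base), base, "]")
  dat[[base]] <- nchar(gsub(pattern, "", dat$newcol))
}

dat[, c("A", "C", "G", "T")]
```

On these five rows the A, C, G and T columns match the five rows of desired output quoted at the top of this message.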

I am deeply indebted to Gabor Grothendieck. He taught me all I know
about regex. The man is a master at patterns.

--

David Winsemius, MD
West Hartford, CT


