[R] For help in R coding

Bansal, Vikas vikas.bansal at kcl.ac.uk
Mon Jul 4 19:29:12 CEST 2011


Dear sir,

I have one more problem. Sorry to disturb you again.

I have a data frame like this:

Col1  Col2  Col3  Col4
   1     0     1     4
   0     0     0     2
   4     2     0     0
   1     5     0     0
   0     0     4     3
   0     0     0     2
   0     0     0     0
   1     1     0     5

I want to delete all rows that contain more than two 0s.
In the input above, row 2 has three zeros, row 6 has three zeros, and row 7 has four zeros, so I want to exclude them. My output should be:

Col1  Col2  Col3  Col4
   1     0     1     4
   4     2     0     0
   1     5     0     0
   0     0     4     3
   1     1     0     5

Can you please tell me how to code for this problem?






Thanking you,
Warm Regards
Vikas Bansal
MSc Bioinformatics
King's College London
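
One possible way to do the row filtering asked about above is sketched here. It assumes the data frame is called `df` (a name chosen for illustration, not taken from the post) and that all four columns are numeric. `rowSums(df == 0)` counts the zeros in each row, and rows with more than two are dropped.

```r
# Example data copied from the question; the name `df` is an assumption.
df <- data.frame(
  Col1 = c(1, 0, 4, 1, 0, 0, 0, 1),
  Col2 = c(0, 0, 2, 5, 0, 0, 0, 1),
  Col3 = c(1, 0, 0, 0, 4, 0, 0, 0),
  Col4 = c(4, 2, 0, 0, 3, 2, 0, 5)
)

# df == 0 is a logical matrix; rowSums() counts the TRUEs in each row.
zeros_per_row <- rowSums(df == 0)

# Keep only the rows with at most two zeros.
kept <- df[zeros_per_row <= 2, ]
print(kept)
```

Rows 2, 6 and 7 of the example are removed, which leaves the five rows listed in the desired output. The original row names are retained, so `rownames(kept) <- NULL` can be added if consecutive numbering is wanted.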
________________________________________
From: David Winsemius [dwinsemius at comcast.net]
Sent: Monday, July 04, 2011 2:02 AM
To: Bansal, Vikas
Cc: Dennis Murphy; r-help at r-project.org
Subject: Re: [R] For help in R coding

On Jul 3, 2011, at 6:10 PM, Bansal, Vikas wrote:

>> So I want to code so that it will give the output like this-
>>
>> DATA FRAME (Input)

Editing the task so it is reproducible:

dat <- read.table(textConnection(' col3                 col9
   T                      .a,g,,
   A                    .t,t,,
   A                    .,c,c,
   C                     .,a,,,
   G                     .,t,t,t
   A                     .c,,g,^!.
   A                      .g,ggg.^!,
   A                      .$,,,,,.,
   C                      a,g,,t,
   T                      ,,,,,.,^!.
   T                       ,$,,,,.,."'), header=TRUE,
stringsAsFactors=FALSE)

>> output
>>
>> A  C  G  T
>> 1  0  1  4
>> 4  0  0  2
>> 4  2  0  0
>> 1  5  0  0
>> 0  0  4  3

It's also possible to apply the logic that Gabor Grothendieck offered
at the beginning of this thread:

dat[, "newcol"] <- apply(dat, 1, function(x) gsub("\\,|\\." ,x[1],
x[2])  )
# ... and the obvious repetition for C, G and T

 > dat[,"A"] <- nchar( gsub("[^aA]", "", dat[ , "newcol"] ))
 > dat
    col3       col9     newcol A
1     T     .a,g,,     TaTgTT 1
2     A     .t,t,,     AtAtAA 4
3     A     .,c,c,     AAcAcA 4
4     C     .,a,,,     CCaCCC 1
5     G    .,t,t,t    GGtGtGt 0
6     A  .c,,g,^!.  AcAAgA^!A 5
7     A .g,ggg.^!, AgAgggA^!A 4
8     A  .$,,,,,.,  A$AAAAAAA 8
9     C    a,g,,t,    aCgCCtC 1
10    T ,,,,,.,^!. TTTTTTT^!T 0
11    T ,$,,,,.,." T$TTTTTTT" 0
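
The repetition for the other three bases could be written as a loop. The following is only a sketch of that idea, not code from the thread: it rebuilds the first five rows of `dat` by hand so that it runs on its own, uses `mapply()` in place of the `apply()` call above, and the loop variable `base` is a name introduced for illustration.

```r
# First five rows of the example, typed in directly (an assumption made
# so the sketch is self-contained and needs no read.table() call).
dat <- data.frame(
  col3 = c("T", "A", "A", "C", "G"),
  col9 = c(".a,g,,", ".t,t,,", ".,c,c,", ".,a,,,", ".,t,t,t"),
  stringsAsFactors = FALSE
)

# Replace every "." or "," in col9 with the reference base from col3.
dat$newcol <- mapply(function(ref, reads) gsub("[.,]", ref, reads),
                     dat$col3, dat$col9, USE.NAMES = FALSE)

# For each base, delete every character that is not that base (in either
# case) and count what is left.
for (base in c("A", "C", "G", "T")) {
  pattern <- paste0("[^", base, tolower(base), "]")
  dat[[base]] <- nchar(gsub(pattern, "", dat$newcol))
}
print(dat[, c("A", "C", "G", "T")])
```

For these five rows the A, C, G and T columns come out the same as the five-row output table quoted at the top of this reply.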

I am deeply in debt to Gabor Grothendieck. He taught me all I know
regarding regex. The man is a master at patterns.

--

David Winsemius, MD
West Hartford, CT


