[R] Dataframe manipulation question

Wed Oct 6 22:38:02 CEST 2004

Greg Blevins wrote:

> Hello,
> 
> I have a data frame that has three fields.
> 
> Resp#   ActCode   ProdUsed
> 100     3         2
> 100     3         2
> 100     4         3
> 100     4         3
> 101     3         6
> 102     2         1
> 102     3         1
> 103     5         1
> 103     5         1
> 103     3         2
> 103     3         2
> 104     3         1
> 
> What I seek to do.
> 
> If a row is identical to the row before it in the fields Resp# and
> ActCode, I want to delete one of the two matching rows. Based on this
> logic, the resulting df would look like the one shown below.
> 
> Resp# ActCode    ProdUsed
> 100      3           2
> 100      4           3
> 101      3           6
> 102      2           1
> 102      3           1
> 103      5           1
> 103      3           2
> 104      3           1
> 
> I have tried match, and tried to write something that deletes a row if
> the current row minus the previous row equals 0 for both Resp# and
> ActCode, but to no avail.  Not knowing what to search on for this
> problem, I turn to the R experts for help.
> 
> (Windows 2000, R 2.0, 384 meg memory)
> Greg Blevins
> The Market Solutions Group

Would ?unique work for you?

 > x
    Resp ActCode ProdUsed
1   100       3        2
2   100       3        2
3   100       4        3
4   100       4        3
5   101       3        6
6   102       2        1
7   102       3        1
8   103       5        1
9   103       5        1
10  103       3        2
11  103       3        2
12  104       3        1
 > unique(x)
    Resp ActCode ProdUsed
1   100       3        2
3   100       4        3
5   101       3        6
6   102       2        1
7   102       3        1
8   103       5        1
10  103       3        2
12  104       3        1

Or, if you only want to drop rows that repeat in the first two columns
(while keeping all three columns in the result), then:

 > x[!duplicated(x[, 1:2]), ]

which, for this example, gives the same result as calling unique(x) directly.
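
One caveat: unique() and duplicated() drop every later repeat, wherever it occurs, whereas the question asks only about a row that matches the row directly before it. The two agree on this data because every repeat happens to be adjacent. If a pair could recur further down, a rough sketch of an adjacent-only comparison might look like the following. The helper same_as_previous() and the second data frame y are made up for illustration (they are not part of base R or of the original post), and the sketch assumes the key columns contain no NA values.

```r
# Rebuild the example data frame from the post (first column named Resp, as above).
x <- data.frame(
  Resp     = c(100, 100, 100, 100, 101, 102, 102, 103, 103, 103, 103, 104),
  ActCode  = c(3, 3, 4, 4, 3, 2, 3, 5, 5, 3, 3, 3),
  ProdUsed = c(2, 2, 3, 3, 6, 1, 1, 1, 1, 2, 2, 1)
)

# Hypothetical helper: TRUE where a row equals the row directly above it
# in all of the given key columns. Assumes no NA values in those columns.
same_as_previous <- function(df, cols) {
  n <- nrow(df)
  if (n < 2) return(rep(FALSE, n))
  keys <- df[, cols, drop = FALSE]
  same <- rep(TRUE, n - 1)
  for (k in seq_along(keys)) {
    # compare elements 2..n with elements 1..(n-1) of each key column
    same <- same & (keys[[k]][-1] == keys[[k]][-n])
  }
  c(FALSE, same)  # the first row has no predecessor, so it is always kept
}

# On the posted data both approaches keep rows 1, 3, 5, 6, 7, 8, 10 and 12.
consecutive <- x[!same_as_previous(x, c("Resp", "ActCode")), ]
global      <- x[!duplicated(x[, 1:2]), ]

# Made-up data where they differ: the pair (100, 3) returns in row 4,
# but not directly after its earlier occurrence.
y <- data.frame(Resp = c(100, 100, 101, 100),
                ActCode = c(3, 3, 4, 3),
                ProdUsed = c(2, 2, 1, 2))
y_consecutive <- y[!same_as_previous(y, c("Resp", "ActCode")), ]  # keeps rows 1, 3, 4
y_global      <- y[!duplicated(y[, 1:2]), ]                       # keeps rows 1, 3
```

Which behaviour is wanted depends on whether a later, non-adjacent repeat of the same Resp/ActCode pair should count as a duplicate.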

--sundar