[R] Unique rows in data frame (with condition)

Ralf B ralf.bierig at gmail.com
Fri Jul 30 06:18:59 CEST 2010


I have to deal with data frames that contain multiple entries for the
same item (identified by an 'id' column). The second column mostly
corresponds to the id column, which means that duplicate entries can
be eliminated with ?unique.

a <- unique(data.frame(timestamp=c(3,3,3,5,8), mylabel=c("a","a","a","b","c")))

However, sometimes I have data frames like this:

a <- unique(data.frame(timestamp=c(3,3,3,5,8), mylabel=c("a","z","a","b","c")))

which then results in:

  timestamp mylabel
1         3       a
2         3       z
4         5       b
5         8       c

However, I want only the first occurrence of each timestamp, together
with the first label that appears for it, resulting in output like this:

  timestamp mylabel
1         3       a
4         5       b
5         8       c

Is there something like GROUP BY (as in SQL)?
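
For illustration, here is a minimal sketch of one way this might be done in base R, assuming the first row for each timestamp is the one to keep. The names df, first_rows and first_labels are made up for this example, and stringsAsFactors = FALSE is an assumption so that the labels stay plain strings. The first variant filters on the id column with duplicated(); the second uses aggregate() as a rough GROUP BY equivalent.

```r
# Example data from the post; stringsAsFactors = FALSE keeps labels as strings
df <- data.frame(timestamp = c(3, 3, 3, 5, 8),
                 mylabel   = c("a", "z", "a", "b", "c"),
                 stringsAsFactors = FALSE)

# duplicated() flags every repeat of a timestamp after its first occurrence,
# so negating it keeps only the first row per timestamp
first_rows <- df[!duplicated(df$timestamp), ]
print(first_rows)

# GROUP BY style alternative: take the first label within each timestamp group
first_labels <- aggregate(mylabel ~ timestamp, data = df,
                          FUN = function(x) x[1])
print(first_labels)
```

The duplicated() variant keeps the original row names (1, 4, 5), while aggregate() returns a new data frame sorted by timestamp with fresh row names.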

Best,
Ralf


