[R] which duplicated rows to delete

Mon Oct 30 11:46:39 CET 2006

Try this.  The first line groups the positions of x by value, giving
one vector of indices per distinct value, and the second line keeps
only the groups that contain more than one index:

out <- tapply(seq_along(x), x, function(i) i)  # positions of x, grouped by value
out[sapply(out, length) > 1]                   # keep values that occur more than once
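
An equivalent way to build the groups, and one possible way to use them
to decide which rows to drop, might look like the sketch below.  It is
only an illustration: the data frame `dat`, its `score` column and the
"keep the highest score" rule are hypothetical stand-ins for the real
data and criterion, and `lengths()` needs R 3.2.0 or later.

```r
x <- c(1, 2, 3, 4, 2, 6, 2, 8, 2, 3)

# Group the positions of x by the value found at each position.
groups <- split(seq_along(x), x)

# Keep only the values that occur more than once.
dup_groups <- groups[lengths(groups) > 1]
dup_groups
# $`2`
# [1] 2 5 7 9
#
# $`3`
# [1]  3 10

# Hypothetical data: one row per element of x, plus a made-up score.
dat <- data.frame(id = x, score = c(5, 1, 7, 2, 9, 4, 3, 6, 8, 0))

# Hypothetical criterion: within each duplicated group, keep the row
# with the highest score and drop the others.
keep <- vapply(dup_groups,
               function(idx) idx[which.max(dat$score[idx])],
               integer(1))
drop <- setdiff(unlist(dup_groups), keep)

res <- dat[-drop, ]
res
```

Working with the index groups, rather than with the output of
duplicated() alone, means the row that survives no longer depends on
how the data happen to be sorted.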

On 10/30/06, Søren Merser <merser at image.dk> wrote:
> Hi
> Say I've this vector with several duplicates
> >x<-c(1,2,3,4,2,6,2,8,2,3)
>
> >which(duplicated(x))
> [1]  5  7  9 10
>
> But what I really want is something like:
> List({2,5,7,9}, {3,10}, ...)
>
> Then from each sublist I can specify which of the duplicate items to drop
>
> res<-NULL
> for(vec in myDuplicateList)
>        res<-rbind(res, subset(data[vec,], myCrit))
>
> I'll get some of the way by sorting my original data appropriately, as it's
> the second and following rows that are 'marked' as duplicates, but that's
> not quite enough
>
> Hope for some hints
> Kind regards Søren