[R] which duplicated rows to delete

Mon Oct 30 11:39:20 CET 2006

Hi

you can use

apply(outer(( (1:10)[1:10%in%x]), x, "=="), 1, which)

to get a list of the positions of each value, duplicates included. But
then you will need to specify which duplicates you want to discard,
which can be problematic.
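
The outer() call above assumes the values of x lie in 1:10, and apply()
only returns a list when the groups differ in length. A more general
sketch, assuming base R only, groups the positions with split() and keeps
the groups that have more than one member. The rule used at the end (keep
the last occurrence in each group) is only an illustrative assumption;
substitute whatever criterion fits your data.

```r
x <- c(1, 2, 3, 4, 2, 6, 2, 8, 2, 3)

# Positions of every value, grouped by value
pos <- split(seq_along(x), x)

# Keep only the values that occur more than once
dups <- Filter(function(i) length(i) > 1, pos)
dups
# $`2`
# [1] 2 5 7 9
#
# $`3`
# [1]  3 10

# Illustrative rule (an assumption, not from the original question):
# within each group, drop everything except the last occurrence
drop <- unlist(lapply(dups, function(i) i[-length(i)]), use.names = FALSE)

# setdiff() stays correct even when there is nothing to drop,
# whereas x[-drop] would return an empty vector in that case
keep <- setdiff(seq_along(x), drop)
x[keep]
# [1] 1 4 6 8 2 3
```

The same index vectors can be used on the rows of a data frame, e.g.
data[keep, ], if x is the key column of that data frame.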

HTH
Petr

On 30 Oct 2006 at 11:11, Søren Merser wrote:

From:           	Søren Merser <merser at image.dk>
To:             	"R - help" <r-help at stat.math.ethz.ch>
Date sent:      	Mon, 30 Oct 2006 11:11:01 +0100
Subject:        	[R] which duplicated rows to delete

> Hi
> Say I've this vector with several duplicates
> >x<-c(1,2,3,4,2,6,2,8,2,3)
> 
> >which(duplicated(x))
> [1]  5  7  9 10
> 
> But what I really want is something like:
> List({2,5,7,9}, {3,10}, ...)
> 
> Then from each sublist I can specify which of the duplicate items to
> drop
> 
> res<-NULL
> for(vec in myDuplicateList) 
>  res<-rbind(res, subset(data[vec,], myCrit))
> 
> I'll get some of the way by sorting my original data appropriately, as
> it's the second and following rows that are 'marked' as duplicates,
> but that's not quite enough
> 
> Hope for some hints
> Kind regards Søren
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.

Petr Pikal
petr.pikal at precheza.cz