[R] which duplicated rows to delete
Petr Pikal
petr.pikal at precheza.cz
Mon Oct 30 11:39:20 CET 2006
Hi
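you can use
apply(outer((1:10)[1:10 %in% x], x, "=="), 1, which)
to get, for each distinct value in x, the positions at which it occurs
(a value that occurs only once gives a single position). Note that the
1:10 assumes the values of x lie in that range; sort(unique(x)) would
serve in the general case. You will then still need to specify which of
the duplicates to discard, which can be problematic.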
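As an alternative sketch of the same grouping, base R's split() can collect the positions per value, and lengths() can filter out values that occur only once. The names groups, dup_groups, drop and x_kept are only illustrative, and the rule "keep the last occurrence" is an assumed criterion standing in for whatever rule you really need:

```r
# Example vector from the original question
x <- c(1, 2, 3, 4, 2, 6, 2, 8, 2, 3)

# Positions of every distinct value, as a list named by value
groups <- split(seq_along(x), x)

# Keep only the values that occur more than once:
# element "2" holds positions 2 5 7 9, element "3" holds positions 3 10
dup_groups <- groups[lengths(groups) > 1]

# Assumed criterion (hypothetical): keep the last occurrence in each
# group and drop the earlier ones
drop <- unlist(lapply(dup_groups, function(idx) idx[-length(idx)]),
               use.names = FALSE)

# Only safe when drop is non-empty, because x[-integer(0)] returns an
# empty vector rather than all of x
x_kept <- if (length(drop) > 0) x[-drop] else x
```

For a data frame the same index list could be used on the rows, replacing the "keep last" rule with your own test inside the lapply() call.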
HTH
Petr
On 30 Oct 2006 at 11:11, Søren Merser wrote:
From: Søren Merser <merser at image.dk>
To: "R - help" <r-help at stat.math.ethz.ch>
Date sent: Mon, 30 Oct 2006 11:11:01 +0100
Subject: [R] which duplicated rows to delete
> Hi
> Say I've this vector with several duplicates
> >x<-c(1,2,3,4,2,6,2,8,2,3)
>
> >which(duplicated(x))
> [1]  5  7  9 10
>
> But what I really want is something like:
> List({2,5,7,9}, {3,10}, ...)
>
> Then from each sublist I can specify which of the duplicate items to
> drop
>
> res<-NULL
> for(vec in myDuplicateList)
> res<-rbind(res, subset(data[vec,], myCrit))
>
> I'll get some of the way by sorting my original data appropriately, as
> it's the second and following rows that are 'marked' as duplicates,
> but that's not quite enough
>
> Hope for some hints
> Kind regards Søren