[R] More elegant way of excluding rows with equal values in any 2 columns?

Erik Iverson eiverson at NMDP.ORG
Mon Sep 21 21:03:50 CEST 2009


It probably does. I had not looked at his solution until just now, but my guess is that the two are equivalent. There are usually at least a couple of ways to do things in R, so no problem :). With massive datasets, though, it might make sense to time a couple of different approaches to see whether one is faster.
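
For anyone who wants to try the timing idea, here is a rough sketch using system.time(). The 6-column grid "big" and the anyDuplicated() variant are my own assumptions for illustration; they are not taken from either solution in this thread, and the timings will vary by machine.

```r
# Hypothetical larger grid (not from the thread): 6^6 = 46656 rows.
big <- as.matrix(expand.grid(1:6, 1:6, 1:6, 1:6, 1:6, 1:6))

# Row test from the quoted solution: TRUE when no value repeats in the row.
time_dup <- system.time(
  keep_dup <- apply(big, 1, function(x) all(!duplicated(x)))
)

# Assumed alternative: anyDuplicated() returns 0 when a row has no repeats.
time_any <- system.time(
  keep_any <- apply(big, 1, function(x) anyDuplicated(x) == 0)
)

# Both tests should select exactly the same rows.
stopifnot(identical(keep_dup, keep_any))

# Elapsed seconds for each variant.
print(c(duplicated = time_dup[["elapsed"]],
        anyDuplicated = time_any[["elapsed"]]))
```

On a grid as small as the original 4-column one the difference is unlikely to matter.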

You could also replace the "all(!duplicated(x))" test in my function with one based on "unique", e.g. "length(unique(x)) == length(x)".
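
A minimal sketch of that variant, checked against the version quoted below. The names keep_dup, keep_unq and t3_unq are just for illustration, and the grid is rebuilt from the original post.

```r
# Rebuild the grid from the original post: 4^4 = 256 rows.
index <- expand.grid(1:4, 1:4, 1:4, 1:4)

# Test from the quoted solution: keep rows in which no value is duplicated.
keep_dup <- apply(index, 1, function(x) all(!duplicated(x)))

# unique()-based test: a row has no repeats when unique() drops nothing.
keep_unq <- apply(index, 1, function(x) length(unique(x)) == length(x))

t3     <- index[keep_dup, ]
t3_unq <- index[keep_unq, ]

# The two tests select the same rows.
stopifnot(identical(t3, t3_unq))

nrow(t3)  # 24, the 4! orderings of 1:4
```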

Erik 

> -----Original Message-----
> From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com]
> Sent: Monday, September 21, 2009 2:02 PM
> To: Erik Iverson
> Cc: R-Help List
> Subject: Re: [R] More elegant way of excluding rows with equal values in
> any 2 columns?
> 
> Thank you very much, both Dimitris and Erik.
> Erik - you are right, I was trying to remove any duplication (i.e., if
> there are the same values in 2 or 3 or 4 columns).
> And it looks like that's what your solution does.
> But doesn't it do the same thing as Dimitris' solution?
> 
> Dimitri
> 
> On Mon, Sep 21, 2009 at 2:55 PM, Erik Iverson <eiverson at nmdp.org> wrote:
> > Hello,
> >
> > Do you mean exactly any 2 columns?  What if the value is equal in more
> than 2 columns?
> >
> >>
> >> I built a data frame "grid" (below) with 4 columns. I want to exclude
> >> all rows that have equal values in ANY 2 columns. Here is how I am
> >> doing it:
> >>
> >> index<-expand.grid(1:4,1:4,1:4,1:4)
> >
> > If a value is equal in 2 or more columns of a row, i.e., duplicated, then the
> following should work, assuming index can be changed to a matrix for apply
> ...
> >
> > t3 <- index[apply(index, 1, function(x) all(!duplicated(x))),]
> >
> 
> 
> 
> --
> Dimitri Liakhovitski
> Ninah.com
> Dimitri.Liakhovitski at ninah.com



