[R] More elegant way of excluding rows with equal values in any 2 columns?

William Dunlap wdunlap at tibco.com
Mon Sep 21 21:26:56 CEST 2009


Assuming your real dataset isn't the one you showed
(for which e1071::permutations(4) works well), you can
sort each row and then quickly check for duplicates by
comparing each column to the previous column.  E.g.,

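f <- function(index){
   # Sort the values within each row (vectorized, no per-row loop).
   rowSort <- function(x){
      x <- t(as.matrix(x))          # each original row becomes a column
      x[] <- x[order(col(x), x)]    # order by column index, then by value
      t(x)
   }
   tmp <- rowSort(index)
   keep <- rep(TRUE, nrow(tmp))
   # After sorting, equal values sit next to each other,
   # so comparing neighbouring columns is enough.
   if(ncol(tmp)>1) for(i in 2:ncol(tmp))
     keep <- keep & tmp[,i] != tmp[,i-1]
   index[keep, , drop=FALSE]
}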

f(index)
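
To see why the order(col(x), x) step sorts every row at once, here is a
small sketch on a made-up 2 x 3 matrix (the matrix m and the names
sorted and has_dup are only for illustration, they are not part of the
original question):

```r
# Hypothetical example matrix: row 1 has distinct values, row 2 repeats a 2.
m <- matrix(c(3, 1, 2,
              2, 2, 1), nrow = 2, byrow = TRUE)

x <- t(m)                      # each original row is now a column
x[] <- x[order(col(x), x)]     # sort by column index first, then by value
sorted <- t(x)                 # back to the original orientation

# A row contains a repeated value exactly when two neighbours
# in its sorted version are equal.
has_dup <- sorted[, 2] == sorted[, 1] | sorted[, 3] == sorted[, 2]
```

Because the values are ordered within each column of the transposed
matrix, one call to order() handles all rows together.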

Some package probably has a row sorting function but
the above works pretty well.
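
If speed is not a concern, a shorter base-R alternative is to apply
anyDuplicated() to each row. This is only a sketch and not the approach
used above: it loops over rows inside apply(), so it will likely be
slower on large data. The names keep and result are illustrative.

```r
# Same grid as in the original question.
index <- expand.grid(1:4, 1:4, 1:4, 1:4)

# anyDuplicated() returns 0 when a vector has no repeated values,
# so keep exactly the rows where it returns 0.
keep <- apply(index, 1, anyDuplicated) == 0
result <- index[keep, , drop = FALSE]

dim(result)
```

For this 4-column grid over 1:4, the rows that survive are the
permutations of 1:4.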

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Dimitri 
> Liakhovitski
> Sent: Monday, September 21, 2009 11:14 AM
> To: R-Help List
> Subject: [R] More elegant way of excluding rows with equal 
> values in any 2columns?
> 
> Hello, dear R-ers!
> 
> I built a data frame "grid" (below) with 4 columns. I want to exclude
> all rows that have equal values in ANY 2 columns. Here is how I am
> doing it:
> 
> index<-expand.grid(1:4,1:4,1:4,1:4)
> dim(index)
> # Deleting rows that have identical values in any two columns
> # (1 line of code):
> index<-index[!(index$Var1==index$Var2)&!(index$Var1==index$Var3)&
>   !(index$Var1==index$Var4)&!(index$Var2==index$Var3)&
>   !(index$Var2==index$Var4)&!(index$Var3==index$Var4),]
> dim(index)
> index
> 
> 
> I was wondering if there is a more elegant way of doing it - because
> as the number of columns increases, the amount of code one would have
> to write increases A LOT.
> 
> Thank you very much for any suggestion!
> 
> 
> 
> -- 
> Dimitri Liakhovitski
> Ninah.com
> Dimitri.Liakhovitski at ninah.com
> 



