[R] More elegant way of excluding rows with equal values in any 2 columns?
William Dunlap
wdunlap at tibco.com
Mon Sep 21 21:26:56 CEST 2009
Assuming your real dataset isn't the one you showed
(for which e1071::permutations(4) works well) you can
sort each row and then quickly check for duplicates by
comparing each column to the previous column. E.g.,
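f <- function(index){
   # Sort the values within each row
   rowSort <- function(x){
      x <- t(as.matrix(x))
      x[] <- x[order(col(x), x)]
      t(x)
   }
   tmp <- rowSort(index)
   keep <- rep(TRUE, nrow(tmp))
   # After sorting, equal values are adjacent, so compare neighbouring columns
   if(ncol(tmp)>1) for(i in 2:ncol(tmp))
      keep <- keep & tmp[,i] != tmp[,i-1]
   index[keep, , drop=FALSE]
}
f(index)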
Some package probably has a row sorting function but
the above works pretty well.
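
For comparison, here is a minimal sketch of a shorter (though likely slower on large inputs) alternative that checks each row directly with anyDuplicated() instead of sorting. The helper name drop_rows_with_ties is hypothetical and not part of the original post; it assumes the input is a data frame or matrix like the expand.grid() example from the question.

```r
# Hypothetical helper (not from the original post): keep only the rows
# whose values are all distinct.
drop_rows_with_ties <- function(index) {
  m <- as.matrix(index)
  # anyDuplicated() returns 0 when a vector contains no repeated values
  keep <- apply(m, 1, anyDuplicated) == 0
  index[keep, , drop = FALSE]
}

# The grid from the original question: all 4^4 = 256 combinations of 1:4
index <- expand.grid(1:4, 1:4, 1:4, 1:4)
result <- drop_rows_with_ties(index)
dim(result)  # 24 rows (the 4! permutations of 1:4) by 4 columns
```

Because apply() calls anyDuplicated() once per row, this version is easier to read but does more work per row than the sort-and-compare loop above, which operates on whole columns at a time.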
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Dimitri
> Liakhovitski
> Sent: Monday, September 21, 2009 11:14 AM
> To: R-Help List
> Subject: [R] More elegant way of excluding rows with equal
> values in any 2 columns?
>
> Hello, dear R-ers!
>
> I built a data frame "grid" (below) with 4 columns. I want to exclude
> all rows that have equal values in ANY 2 columns. Here is how I am
> doing it:
>
> index<-expand.grid(1:4,1:4,1:4,1:4)
> dim(index)
> # Deleting rows that have identical values in any two columns (1 line of code):
> index<-index[!(index$Var1==index$Var2)&!(index$Var1==index$Var3)&!(index$Var1==index$Var4)&!(index$Var2==index$Var3)&!(index$Var2==index$Var4)&!(index$Var3==index$Var4),]
> dim(index)
> index
>
>
> I was wondering if there is a more elegant way of doing it - because
> as the number of columns increases, the amount of code one would have
> to write increases A LOT.
>
> Thank you very much for any suggestion!
>
>
>
> --
> Dimitri Liakhovitski
> Ninah.com
> Dimitri.Liakhovitski at ninah.com
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>