[R] any way to make it work faster (deleting rows that contain certain values)
Dimitri Liakhovitski
ld7631 at gmail.com
Tue Sep 22 20:07:24 CEST 2009
Hello, dear R'ers,
index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
In this case, dim(index) is 7,340,032 (!) rows by 11 columns.
I realize it's huge.
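As a quick check of that size, the row count is just the product of the
number of values in each column, and the same arithmetic gives a rough memory
estimate. This is only a sketch: the variable names are made up for
illustration, and the 4 bytes per value assumes the columns stay integer,
which is what expand.grid returns for inputs like 1:7.

```r
# Illustrative names, not taken from the original script.
levels.per.column <- c(7, rep(4, 10))   # one 1:7 column, ten 1:4 columns
n.rows <- prod(levels.per.column)       # 7 * 4^10
n.cols <- length(levels.per.column)

# Assumes integer storage (4 bytes per value); ignores data frame overhead.
approx.mib <- n.rows * n.cols * 4 / 2^20

print(c(rows = n.rows, cols = n.cols, MiB = approx.mib))
```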
Then, I am trying to get rid of the rows with undesired combinations of values.
No row should contain the same value in any 2 columns.
Also, if column 1 has a value of 5, there should be no 2 in any other column;
if column 1 has a value of 6, there should be no 3 in any other column; and
if column 1 has a value of 7, there should be no 4 in any other column.
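The rules above can be written down as a test on a single row, which makes
them easy to check by hand before applying them to the whole grid. This is a
minimal sketch: the helper name violates() and the lookup vector forbidden
are hypothetical and do not appear in the original script.

```r
# Hypothetical helper: TRUE if a row breaks any of the rules stated above.
violates <- function(row) {
  row <- unlist(row, use.names = FALSE)   # accepts a vector or a data frame row
  first <- row[1]
  rest <- row[-1]

  # Rule 1: no value may appear in two columns.
  has.duplicate <- anyDuplicated(row) > 0

  # Rules 2-4: a 5, 6 or 7 in column 1 forbids a 2, 3 or 4 elsewhere.
  forbidden <- c(`5` = 2, `6` = 3, `7` = 4)
  key <- as.character(first)
  has.forbidden <- key %in% names(forbidden) && forbidden[[key]] %in% rest

  has.duplicate || has.forbidden
}

violates(c(5, 1, 3, 4))   # FALSE: all values differ, no 2 alongside the 5
violates(c(5, 1, 2, 4))   # TRUE: column 1 is 5 and a 2 appears elsewhere
violates(c(1, 2, 2, 3))   # TRUE: the value 2 appears twice
```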
I wrote a generic script to achieve that (below).
However, I was wondering if it's possible to make it any faster - it
looks like with that huge index it's going to take me days...
Thanks a lot for any suggestion!
Dimitri
index <- expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
bad.pairs <- matrix(c(1,1,2,2,3,3,4,4,5,2,6,3,7,4), nrow=7, ncol=2, byrow=TRUE)
for(i in seq_len(ncol(index))){ # looping through columns of the "index"
  for(pair in seq_len(nrow(bad.pairs))){ # looping through rows of "bad.pairs"
    # note: despite its name, "keep" is TRUE for rows that match a bad pair,
    # i.e. the rows to remove; seq_len/vapply also cope with 0 remaining rows
    keep <- vapply(seq_len(nrow(index)), function(x){
      temp <- (index[[x,i]]==bad.pairs[pair,1]) &
        (any(index[x,-i]==bad.pairs[pair,2]))
      return(temp)
    }, logical(1))
    index <- index[!keep,]
  }
}
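One possible way to speed this up, offered as a sketch rather than a tested
solution for the full grid, is to drop the row-by-row sapply and compare whole
columns at once on a matrix. The names m, bad, col.pairs and rules are
illustrative, and the demo uses a much smaller grid (one 1:7 column and three
1:4 columns) so that it runs instantly.

```r
# Small demo grid; the grid in the post has ten 1:4 columns instead of three.
index <- expand.grid(1:7, 1:4, 1:4, 1:4)
m <- as.matrix(index)          # a matrix makes column comparisons cheap

bad <- logical(nrow(m))        # TRUE marks a row to remove

# Rule 1: flag rows where any two columns hold the same value.
col.pairs <- combn(ncol(m), 2) # every pair of column indices, one per column
for (k in seq_len(ncol(col.pairs))) {
  bad <- bad | (m[, col.pairs[1, k]] == m[, col.pairs[2, k]])
}

# Rules 2-4: a 5, 6 or 7 in column 1 forbids a 2, 3 or 4 in the other columns.
rules <- rbind(c(5, 2), c(6, 3), c(7, 4))
for (r in seq_len(nrow(rules))) {
  hit <- rowSums(m[, -1, drop = FALSE] == rules[r, 2]) > 0
  bad <- bad | (m[, 1] == rules[r, 1] & hit)
}

result <- index[!bad, ]
nrow(result)                   # 42 for this 4-column demo grid
```

Each comparison here runs over all rows in one vectorized step, so the work is
a few dozen passes over the data instead of millions of function calls. On the
full grid the intermediate logical matrix in rowSums() would need a few hundred
MB, so memory may become the limit. Note also that, read literally, the first
rule cannot be satisfied on the full grid: columns 2 to 11 are ten columns
drawn from only four values, so at least two of them always match and every
row would be removed.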
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com