[R] Avoiding loops using 'for' and pairwise comparison of columns
Kulupp
kulupp at online.de
Mon Jun 24 11:01:41 CEST 2013
Dear R-experts,
I'd like to avoid the use of very slow 'for'-loops but I don't know how.
My data look as follows (the original data have 1600 rows and 30 columns):
# data example
c1 <- c(1,1,1,0.25,0,1,1,1,0,1)
c2 <- c(0,0,1,1,0,1,0,1,0.5,1)
c3 <- c(0,1,1,1,0,0.75,1,1,0.5,0)
x <- data.frame(c1,c2,c3)
I need to compare every column with every other column and want to know
the percentage of identical values for each column pair. To calculate
this percentage I used the function 'agree' from the irr package. I
solved the problem with a nested loop, which is very slow.
library(irr) # required for the function 'agree'
# empty data frame for the results
a <- as.data.frame(matrix(data=NA, nrow=ncol(x), ncol=ncol(x)))
colnames(a) <- colnames(x)
rownames(a) <- colnames(x)
# the loop to write the data
for (j in seq_len(ncol(x))) {
  for (i in seq_len(ncol(x))) {
    a[i, j] <- agree(cbind(x[, j], x[, i]))$value
  }
}
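For a single column pair, the quantity computed inside the loop can be sketched in base R as follows. This assumes numeric columns without missing values and exact equality (the default tolerance of 0 in 'agree'). The helper name pct_equal is made up for this illustration and is not part of the irr package.

```r
# Two of the example columns from above
c1 <- c(1, 1, 1, 0.25, 0, 1, 1, 1, 0, 1)
c2 <- c(0, 0, 1, 1, 0, 1, 0, 1, 0.5, 1)

# Hypothetical helper: percentage of positions where two vectors are equal.
# Assumes equal length and no NA values.
pct_equal <- function(u, v) 100 * mean(u == v)

pct_equal(c1, c2)  # 5 of the 10 positions match, i.e. 50
```

The example values (0, 0.25, 0.5, 0.75, 1) are exactly representable as floating-point numbers, so comparing them with == is safe here. For arbitrary decimals a tolerance would be the safer choice.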
I would be very pleased to receive your suggestions on how to avoid the
loop. Furthermore, the resulting data frame could be displayed as a
triangular matrix, so that each pairwise comparison appears only once,
but I don't know how to do this.
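One possible direction, given only as a rough sketch and not as a tested solution for the full 1600 x 30 data: visit each unordered column pair once with combn() and write the results into the upper triangle of a plain matrix. The sketch swaps 'agree' for a base-R equality percentage, which assumes numeric columns, no missing values and a tolerance of 0. The names m, p, pairs, vals and res are chosen for this illustration only.

```r
# Example data from above
c1 <- c(1, 1, 1, 0.25, 0, 1, 1, 1, 0, 1)
c2 <- c(0, 0, 1, 1, 0, 1, 0, 1, 0.5, 1)
c3 <- c(0, 1, 1, 1, 0, 0.75, 1, 1, 0.5, 0)
x <- data.frame(c1, c2, c3)

m <- as.matrix(x)   # matrix indexing is usually cheaper than data frame indexing
p <- ncol(m)

# All column pairs (i, j) with i < j, one pair per column of 'pairs'
pairs <- combn(p, 2)

# Percentage of equal values for each pair (assumes no NA values)
vals <- apply(pairs, 2, function(k) 100 * mean(m[, k[1]] == m[, k[2]]))

# Result matrix: only the upper triangle is filled, the rest stays NA
res <- matrix(NA_real_, nrow = p, ncol = p,
              dimnames = list(colnames(m), colnames(m)))
res[t(pairs)] <- vals

# Show the NA cells as blanks so each comparison appears once
print(res, na.print = "")
```

apply() and combn() still iterate internally, so this is not vectorised in the strict sense. It does halve the number of comparisons and avoids assigning into a data frame cell by cell, which is often the expensive part of such a loop.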
Kind regards
Thomas