[R] Avoiding loops using 'for' and pairwise comparison of columns
Blaser Nello
nblaser at ispm.unibe.ch
Mon Jun 24 12:18:00 CEST 2013
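Here's a possible solution that avoids the explicit loop:
# all ordered pairs of column indices, including each column with itself
k <- as.matrix(expand.grid(1:ncol(x), 1:ncol(x)))
# agreement for every pair, filled column-wise into a square table
a1 <- as.data.frame(matrix(sapply(1:nrow(k), function(n)
    agree(x[, k[n, ]])$value), nrow = ncol(x)))
colnames(a1) <- colnames(x)
rownames(a1) <- colnames(x)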
> identical(a, a1)
[1] TRUE
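As a cross-check that does not need the irr package, the same table can be sketched in base R. This rests on an assumption: that for two numeric columns without missing values, agree() with its default tolerance reports the percentage of rows in which both values are exactly equal. The helper name pct_equal and the result name a3 are made up for this illustration.

```r
# data example from the original post
c1 <- c(1, 1, 1, 0.25, 0, 1, 1, 1, 0, 1)
c2 <- c(0, 0, 1, 1, 0, 1, 0, 1, 0.5, 1)
c3 <- c(0, 1, 1, 1, 0, 0.75, 1, 1, 0.5, 0)
x <- data.frame(c1, c2, c3)

# hypothetical helper (assumed stand-in for agree()$value):
# percentage of rows in which columns i and j hold equal values
pct_equal <- function(i, j) 100 * mean(x[[i]] == x[[j]])

# outer() expects a vectorized function, hence the Vectorize() wrapper
a3 <- outer(seq_along(x), seq_along(x), Vectorize(pct_equal))
dimnames(a3) <- list(colnames(x), colnames(x))
a3
```

The result is a plain numeric matrix rather than a data frame; wrap it in as.data.frame() if a data frame is needed.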
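Or, if you want to avoid calculating each pair twice:
# square table of zeros, to be filled in
a2 <- as.data.frame(matrix(0, nrow=ncol(x), ncol=ncol(x)))
colnames(a2) <- colnames(x)
rownames(a2) <- colnames(x)
# each unordered pair of columns once; same order as lower.tri()
k <- t(combn(1:ncol(x), 2))
a2[lower.tri(a2)] <- sapply(1:nrow(k), function(n)
    agree(x[, k[n, ]])$value)
# a column agrees with itself 100%
a2 <- a2 + diag(100, ncol(x))
# mirror the lower triangle into the upper triangle
a2[upper.tri(a2)] <- t(a2)[upper.tri(a2)]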
> identical(a, a2)
[1] TRUE
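To show each pairwise comparison only once, as the original post also asks, one option is to fill only the lower triangle and leave the rest as NA, then print the NAs as blanks. This is a minimal base-R sketch, not a definitive answer: the equality-based percentage is an assumed stand-in for agree(x[, p])$value (it presumes numeric columns, no missing values, and the default tolerance), and lt is a name chosen for the example.

```r
# data example from the original post
c1 <- c(1, 1, 1, 0.25, 0, 1, 1, 1, 0, 1)
c2 <- c(0, 0, 1, 1, 0, 1, 0, 1, 0.5, 1)
c3 <- c(0, 1, 1, 1, 0, 0.75, 1, 1, 0.5, 0)
x <- data.frame(c1, c2, c3)

n <- ncol(x)
# NA-filled matrix; only the lower triangle gets values
lt <- matrix(NA_real_, n, n, dimnames = list(colnames(x), colnames(x)))

# combn() calls FUN once per unordered column pair p = c(i, j);
# the pair order matches the column-wise order of lower.tri()
lt[lower.tri(lt)] <- combn(n, 2, FUN = function(p) {
  100 * mean(x[[p[1]]] == x[[p[2]]])  # assumed stand-in for agree(x[, p])$value
})

# print NAs as blanks so only the lower triangle is visible
print(lt, na.print = "")
```

With irr loaded, the body of FUN could call agree(x[, p])$value instead, as in the code above.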
Best,
Nello
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Kulupp
Sent: Monday, 24 June 2013 11:02
To: r-help at r-project.org
Subject: [R] Avoiding loops using 'for' and pairwise comparison of
columns
Dear R-experts,
I'd like to avoid the use of very slow 'for'-loops but I don't know how.
My data look as follows (the original data have 1600 rows and 30
columns):
# data example
c1 <- c(1,1,1,0.25,0,1,1,1,0,1)
c2 <- c(0,0,1,1,0,1,0,1,0.5,1)
c3 <- c(0,1,1,1,0,0.75,1,1,0.5,0)
x <- data.frame(c1,c2,c3)
I need to compare every column with every other column and want to know
the percentage of similar values for each column pair. To calculate
this percentage I used the function 'agree' from the irr package. I
solved the problem with a loop, but it is very slow.
library(irr) # required for the function 'agree'
# empty data frame for the results
a <- as.data.frame(matrix(data=NA, nrow=ncol(x), ncol=ncol(x)))
colnames(a) <- colnames(x)
rownames(a) <- colnames(x)
# the loop to write the data
for (j in 1:ncol(x)) {
  for (i in 1:ncol(x)) {
    a[i,j] <- agree(cbind(x[,j], x[,i]))$value
  }
}
I would be very pleased to receive your suggestions on how to avoid the
loop. Furthermore, the resulting data frame could be displayed as a
triangular matrix without duplicates of each pairwise comparison, but I
don't know how to do this.
Kind regards
Thomas