[R] [fixed] vectorized nested loop: apply a function that takes two rows
Jose Quesada
quesada at gmail.com
Wed Jan 24 14:53:00 CET 2007
Thanks Charles, Martin,
Substantial improvement with the vectorized solution. Here is a quick benchmark:
# The loop-based solution:
nestedCos = function (x) {
if (is(x, "Matrix") ) {
cos = array(NA, c(ncol(x), ncol(x)))
for (i in 2:ncol(x)) {
for (j in 1:(i - 1)) {
cos[i, j] = cosine(x[, i], x[, j])
}
}
}
return(cos)
}
# Charles C. Berry's vectorized approach
flatCos = function (x) {
res = crossprod( x , x )
diagnl = Diagonal( ncol(x), 1 / sqrt( diag( res )))
cos = diagnl %*% res %*% diagnl
return(cos)
}
Benchmarking:
> system.time(for(i in 1:10)nestedCos(x))
(I stopped because it was taking too long)
Timing stopped at: 139.37 3.82 188.76 NA NA
> system.time(for(i in 1:10)flatCos(x))
[1] 0.43 0.00 0.48 NA NA
#------------------------------------------------------
As much as I like to have faster code, I'm still wondering WHY flatCos gets the same results; i.e., why multiplying the inverse sqrt root of the diagonal of x BY x, then BY the diagonal again produces the expected result. I checked the wikipedia page for crossprod and other sources, but it still eludes me. I can see that scaling by the sqrt of the diagonal once makes sense with 'res <- crossprod( x , x ) gives your result up to scale factors of sqrt(res[i,i]*res[j,j])', but I still don't see why you need to postmultiply by the diagonal again.
Maybe trying to attack a simpler problem might help my understanding: e.g., calculating the cos of a column to all other colums of x (that is, the inner part of the nested loop). How would that work in a vectorized way? I'm trying to get some general technique that I can reuse later from this excellent answer.
Thanks,
-Jose
>
>
> I am rusty on 'Matrix', but I see there are crossprod methods for those
> classes.
>
> res <- crossprod( x , x )
>
> gives your result up to scale factors of sqrt(res[i,i]*res[j,j]), so
> something like
>
> diagnl <- Diagonal( ncol(x), sqrt( diag( res ) )
>
OOPS! Better make that
diagnl <- Diagonal( ncol(x), 1 / sqrt( diag( res ) )
>
> final.res <- diagnl %*% res %*% diagnl
>
> should do it.
>
--
Cheers,
-Jose
--
Jose Quesada, PhD
Research fellow, Psychology Dept.
Sussex University, Brighton, UK
http://www.andrew.cmu.edu/~jquesada
More information about the R-help
mailing list