[R] [fixed] vectorized nested loop: apply a function that takes two rows

Jose Quesada quesada at gmail.com
Wed Jan 24 14:53:00 CET 2007


Thanks Charles, Martin,

Substantial improvement with the vectorized solution. Here is a quick benchmark:

# The loop-based solution:
nestedCos = function (x) {
	if (is(x, "Matrix") ) {
		cos 	= array(NA, c(ncol(x), ncol(x)))
		for (i in 2:ncol(x)) {
			for (j in 1:(i - 1)) {
				cos[i, j] = cosine(x[, i], x[, j])
			}
		}
	}
	return(cos)
}
# Charles C. Berry's vectorized approach
flatCos = function (x) {
	res    		= crossprod( x , x )
	diagnl 		= Diagonal( ncol(x), 1 / sqrt( diag( res )))
	cos 		 	= diagnl %*% res %*% diagnl
	return(cos)
}

Benchmarking:

> system.time(for(i in 1:10)nestedCos(x))
(I stopped because it was taking too long)
Timing stopped at: 139.37 3.82 188.76 NA NA
> system.time(for(i in 1:10)flatCos(x))
[1] 0.43 0.00 0.48   NA   NA

#------------------------------------------------------
As much as I like to have faster code, I'm still wondering WHY flatCos gets the same results; i.e., why multiplying the inverse sqrt root of the diagonal of x BY x, then BY the diagonal again produces the expected result. I checked the wikipedia page for crossprod and other sources, but it still eludes me. I can see that scaling by the sqrt of the diagonal once makes sense with 'res <- crossprod( x , x ) gives your result up to scale factors of sqrt(res[i,i]*res[j,j])', but I still don't see why you need to postmultiply by the diagonal again.

Maybe trying to attack a simpler problem might help my understanding: e.g., calculating the cos of a column to all other colums of x (that is, the inner part of the nested loop). How would that work in a vectorized way? I'm trying to get some general technique that I can reuse later from this excellent answer.

Thanks,
-Jose

>
>
> I am rusty on 'Matrix', but I see there are crossprod methods for those
> classes.
>
> 	res <- crossprod( x , x )
>
> gives your result up to scale factors of sqrt(res[i,i]*res[j,j]), so
> something like
>
> 	diagnl <- Diagonal( ncol(x), sqrt( diag( res ) )
>

OOPS! Better make that

   	diagnl <- Diagonal( ncol(x), 1 / sqrt( diag( res ) )

>
> 	final.res <- diagnl %*% res %*% diagnl
>
> should do it.
>

-- 
Cheers,
-Jose

--
Jose Quesada, PhD
Research fellow, Psychology Dept.
Sussex University, Brighton, UK
http://www.andrew.cmu.edu/~jquesada



More information about the R-help mailing list