[R] dist like function but where you can configure the method

Jari Oksanen jari.oksanen at oulu.fi
Fri May 16 15:57:38 CEST 2014


Witold E Wolski <wewolski <at> gmail.com> writes:

> 
> Looking for a fast dist implementation
> where I could pass my own dist function to the "method" parameter
> 
> i.e.
> 
> mydistfun = function(x,y){
>  return(ks.test(x,y)$p.value)   #some mystique implementation
> }
> 
> wow = dist(data,method=mydistfun)

I think it is best to write that function yourself.

The "dist" object is a numeric vector holding the lower triangle
(without the diagonal, stored column by column) of a symmetric
matrix, plus attributes. The attributes are class, which should be
c("mydist", "dist"); Size, which is the number of items being
compared; Labels (optional), which are the names of your items and,
if given, should have length Size; call = match.call();
Diag = FALSE; Upper = FALSE; and method, the name of your method.
All you need is a vector with attributes.
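
A minimal sketch of such a function could look like the following. The
names mydist, FUN, samples and ksfun are made up for illustration, and I
assume the items come as a list of numeric vectors; adapt as needed.

```r
# Sketch only: 'mydist' and its arguments are made-up names, not part of R.
# x   : a list of items (here numeric vectors) to compare pairwise
# FUN : function(a, b) returning a single number
mydist <- function(x, FUN, method = "custom") {
  n <- length(x)
  stopifnot(n >= 2)
  # Pairs (1,2), (1,3), ..., (2,3), ...: the same order in which "dist"
  # stores the lower triangle column by column.
  pairs <- utils::combn(n, 2)
  d <- vapply(seq_len(ncol(pairs)),
              function(k) FUN(x[[pairs[1, k]]], x[[pairs[2, k]]]),
              numeric(1))
  structure(d,
            class = c("mydist", "dist"),
            Size = n,
            Labels = names(x),      # NULL if the items are unnamed
            call = match.call(),
            Diag = FALSE,
            Upper = FALSE,
            method = method)
}

set.seed(1)
samples <- list(a = rnorm(30), b = rnorm(30), c = rnorm(30, mean = 1))
ksfun <- function(x, y) ks.test(x, y)$p.value
d <- mydist(samples, ksfun, method = "ks.p.value")
d             # printed by print.dist
as.matrix(d)  # full symmetric matrix with zero diagonal
```

Because the class vector ends in "dist", the usual methods (print,
as.matrix, and functions such as hclust() that accept "dist" objects)
should pick it up without further work.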

All this will add very little overhead to your calculation, so
for all practical purposes this implementation is just as fast as
your "mystique implementation" of pairwise distances. Your
example (ks.test()) would probably be pretty slow. If you can
vectorize your distance, it can be really fast, even if you
calculate the full symmetric matrix and throw away the diagonal and
upper triangle.
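
To illustrate the vectorized route, here is a sketch that uses plain
Euclidean distance as a stand-in for your own measure (my choice of
example, not your ks.test() case). It computes the full matrix with
matrix algebra and lets as.dist() keep only the lower triangle.

```r
set.seed(1)
m <- matrix(rnorm(20), nrow = 5)   # 5 items in rows, 4 variables

g <- tcrossprod(m)                 # 5 x 5 matrix of inner products
# squared distance: |a|^2 + |b|^2 - 2 * a.b, for all pairs at once
sq <- outer(diag(g), diag(g), "+") - 2 * g
full <- sqrt(pmax(sq, 0))          # guard against tiny negative rounding errors

d <- as.dist(full)                 # drops the diagonal and upper triangle
all.equal(as.vector(d), as.vector(dist(m)))
```

There are no explicit loops over pairs here, so the wasted work on the
upper triangle is cheap compared with calling an R function once per pair.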

Cheers, Jari Oksanen


