[R-pkgs] New package "proxy" for distances and similarities

David Meyer david.meyer at wu-wien.ac.at
Mon Jul 9 15:49:29 CEST 2007


Dear useRs,

a new package for computing distance and similarity matrices, "proxy", 
is now on CRAN and will propagate to the mirrors soon.

It includes an enhanced version of "dist()" with support for more than 
40 popular similarity and distance measures, both for auto- and 
cross-distances. Some important ones are implemented in C.
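As a rough illustration of the auto- versus cross-distance distinction, here is a minimal sketch. It assumes the interface of current releases of the package (dist(x, y, method = ...) with measure names looked up in the registry), and the toy matrices are invented for the example.

```r
library(proxy)  # note: attaching the package masks stats::dist

# Toy data (made up for this sketch): rows are observations
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)
y <- matrix(c(0, 0,
              0, 1), ncol = 2, byrow = TRUE)

# Auto-distances: all pairwise distances among the rows of x
d_auto <- proxy::dist(x, method = "Euclidean")

# Cross-distances: every row of x against every row of y (3 x 2 result)
d_cross <- proxy::dist(x, y, method = "Manhattan")

print(d_auto)
print(d_cross)
```

The first call gives the familiar lower-triangle "dist" object; the second gives a rectangular matrix with one row per row of x and one column per row of y.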

The proximity measures are stored in a registry which can easily be 
queried and extended by users at run-time. The simplest way to add a new 
measure is to provide it as a small R function; the package code then 
runs the loops at the C level to create the proximity matrix. It is of 
course also possible to use more efficient C implementations, either for 
the distance measure alone or for the whole matrix computation.
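Querying and extending the registry might look roughly as follows. This is a sketch against the registry object pr_DB and its entry_exists(), get_entry() and set_entry() methods as found in current releases; the measure name "maxabs" and the function maxabs() are hypothetical and exist only for this example.

```r
library(proxy)

# Query the registry
summary(pr_DB)                       # overview of the registered measures
pr_DB$entry_exists("Jaccard")        # is there a measure with this name?
print(pr_DB$get_entry("Jaccard"))    # inspect a single registry entry

# A hypothetical measure written as a small R function of two vectors
maxabs <- function(x, y) max(abs(x - y))

# Register it once under the (made-up) name "maxabs"
if (!pr_DB$entry_exists("maxabs"))
  pr_DB$set_entry(FUN = maxabs, names = "maxabs")

# The package loops over all row pairs and calls maxabs() for each
x <- matrix(c(1, 2,
              4, 6,
              9, 3), ncol = 2, byrow = TRUE)
d <- proxy::dist(x, method = "maxabs")
print(d)
```

Only FUN and names are given here; further fields of a registry entry (formula, reference, and so on) are left at their defaults in this sketch.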

Input data is not restricted to matrices: provided the proximity measure 
can handle it, lists and data frames are also accepted.
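For instance, a list of sets of unequal length can be compared with a measure that understands sets. The sketch below assumes that dist() accepts a list for x and a plain R function for method, as in current releases; set_dist() is a hypothetical helper written for this example, not part of the package.

```r
library(proxy)

# Three "observations" of different lengths, stored as a list
sets <- list(c("a", "b", "c"),
             c("b", "c", "d"),
             c("x"))

# Hypothetical set-based distance: 1 - |intersection| / |union|
set_dist <- function(x, y) {
  1 - length(intersect(x, y)) / length(union(x, y))
}

# The measure is passed directly as a function
d <- proxy::dist(sets, method = set_dist)
print(d)
```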

The formulas for binary proximities can conveniently be specified in the 
a/b/c/d/n format, where the numbers of concordant/discordant pairs are 
precomputed at the C level.
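To make the a/b/c/d/n notation concrete, here is a plain R sketch that uses no package code. It assumes the usual 2x2 convention (a = both 1, b = first only, c = second only, d = both 0, n = total); abcdn() is a hypothetical helper that mimics the counts, not the package's C routine.

```r
# Hypothetical helper: counts for two binary vectors of equal length
abcdn <- function(x, y) {
  x <- as.logical(x)
  y <- as.logical(y)
  list(a = sum(x & y),    # concordant: 1/1
       b = sum(x & !y),   # discordant: 1/0
       c = sum(!x & y),   # discordant: 0/1
       d = sum(!x & !y),  # concordant: 0/0
       n = length(x))
}

x <- c(1, 1, 0, 0, 1, 0)
y <- c(1, 0, 0, 1, 1, 0)
counts <- abcdn(x, y)

# With the counts available, a measure is just a formula in a, b, c, d, n
jaccard         <- with(counts, a / (a + b + c))
simple_matching <- with(counts, (a + d) / n)

print(unlist(counts))
print(c(jaccard = jaccard, simple_matching = simple_matching))
```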

We are currently working on support for sparse data.

This is also a "Call for Measures": if you feel that a particular 
similarity or distance measure is missing, please send the formula and a 
reference (or, ideally, the whole registry entry) to one of the package 
maintainers, who will happily add it.

David and Christian.
