[R-pkgs] New package "proxy" for distances and similarities
david.meyer at wu-wien.ac.at
Mon Jul 9 15:49:29 CEST 2007
A new package for computing distance and similarity matrices, "proxy",
has made it to CRAN and will propagate to the mirrors soon.
It includes an enhanced version of "dist()" with support for more than
40 popular similarity and distance measures, both for auto- and
cross-distances. Some important ones are implemented in C.
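As a rough illustration of auto- versus cross-distances, here is a minimal sketch. It assumes the interface found in current CRAN releases of the package (dist() with an optional second argument y, the companion function simil(), and the method names shown); the data are made up for the example.

```r
library(proxy)  # provides the extended dist(), masking stats::dist

set.seed(1)
x <- matrix(rnorm(20), nrow = 5)  # 5 observations, 4 variables (made-up data)
y <- matrix(rnorm(12), nrow = 3)  # 3 observations, 4 variables (made-up data)

# Auto-distances: all pairs of rows of x, returned as a "dist" object
d_auto <- proxy::dist(x, method = "Euclidean")

# Cross-distances: every row of x against every row of y (5 x 3 result)
d_cross <- proxy::dist(x, y, method = "Manhattan")

# Similarities are computed with simil() in the same way
s_auto <- proxy::simil(x, method = "cosine")

print(d_auto)
print(dim(d_cross))
```

Calling the functions with the proxy:: prefix avoids any ambiguity with stats::dist when both are on the search path.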
The proximity measures are stored in a registry which can easily be
queried and extended by users at run-time. To add a new measure, the
simplest way is to provide the distance measure as a small R function;
the package code will then do the loops on the C code level to create the
proximity matrix. It is of course also possible to use more efficient C
implementations, either for the distance measure alone or for the whole
matrix computation.
Input data is not restricted to matrices: provided the proximity measure
can handle it, lists and data frames are also accepted.
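Querying and extending the registry might look like the following sketch. It assumes the registry object is exposed as pr_DB with the methods get_entry(), entry_exists(), set_entry() and delete_entry(), as in current CRAN releases; the measure max_dist and the entry name "myMaxDist" are hypothetical and exist only for this example.

```r
library(proxy)

# Query the registry: overview of all measures, then one full entry
summary(pr_DB)
pr_DB$get_entry("Jaccard")

# Hypothetical user-defined measure: largest absolute coordinate difference
max_dist <- function(x, y) max(abs(x - y))

# Register it under a name chosen for this example only
if (!pr_DB$entry_exists("myMaxDist")) {
  pr_DB$set_entry(FUN = max_dist, names = c("myMaxDist"))
}

x <- rbind(c(0, 0), c(1, 3), c(4, 1))

# Use the new measure by name; the package loops over all row pairs
d <- proxy::dist(x, method = "myMaxDist")

# A function can also be passed directly, without registering it
d_direct <- proxy::dist(x, method = max_dist)

print(d)

# Remove the example entry again
pr_DB$delete_entry("myMaxDist")
```

Registering is mainly useful when a measure should be available by name throughout a session; for a one-off computation, passing the function directly is enough.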
The formulas for binary proximities can conveniently be specified in the
a/b/c/d/n format, where the number of concordant/discordant pairs is
precomputed on the C code level.
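To make the a/b/c/d/n notation concrete, the sketch below computes the counts by hand in plain R for two made-up binary vectors and evaluates two textbook formulas. It assumes the usual 2x2 convention (a = both TRUE, b = only the first TRUE, c = only the second TRUE, d = both FALSE, n = total), and that the registered "Jaccard" measure uses a / (a + b + c).

```r
library(proxy)

# Two made-up binary observations
x <- c(TRUE, TRUE,  FALSE, FALSE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE, TRUE,  TRUE, FALSE)

# Counts of the 2x2 contingency table (assumed convention, see above)
a <- sum(x & y)     # concordant: both TRUE
b <- sum(x & !y)    # discordant: only x TRUE
c <- sum(!x & y)    # discordant: only y TRUE
d <- sum(!x & !y)   # concordant: both FALSE
n <- a + b + c + d  # total number of variables

# Formulas written in a/b/c/d/n form
jaccard         <- a / (a + b + c)
simple_matching <- (a + d) / n

# Compare the hand-computed Jaccard value with the package result
jaccard_pkg <- as.numeric(proxy::simil(rbind(x, y), method = "Jaccard"))

print(c(by_hand = jaccard, package = jaccard_pkg))
```

In the package itself these counts are precomputed in C, so a measure specified in this format only has to supply the formula.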
We are currently working on support for sparse data.
This is also a "Call for Measures": if you feel that a particular
similarity or distance measure is missing, please send the formula and a
reference (or, ideally, the whole registry entry) to one of the package
maintainers who will happily add it.
David and Christian.