[R] binary distance measure of the "dist" function in the "stats" package

David Carlson dcarlson at tamu.edu
Thu Jul 18 17:29:46 CEST 2013


If you read ?dist, it says that:

binary:
    (aka asymmetric binary): The vectors are regarded as
binary bits, so non-zero elements are ‘on’ and zero elements
are ‘off’. The distance is the proportion of bits in which
only one is on amongst those in which at least one is on.
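
As a small illustration of the "non-zero is on" rule in that
help text, the sketch below compares dist() on raw counts with
dist() on the same data recoded to 0/1. The vectors x and y
are made-up values chosen only for this example.

```r
# Hypothetical count vectors; only zero vs. non-zero should matter.
x <- c(2, 0, 5, 0)
y <- c(1, 3, 0, 0)

# Binary distance on the raw counts
d_raw <- as.numeric(dist(rbind(x, y), method = "binary"))

# Binary distance after recoding to 0/1 by hand
x01 <- as.numeric(x != 0)
y01 <- as.numeric(y != 0)
d_bin <- as.numeric(dist(rbind(x01, y01), method = "binary"))

# At least one vector is "on" in positions 1, 2, 3; exactly one
# is "on" in positions 2 and 3, so both distances should be 2/3.
all.equal(d_raw, d_bin)
all.equal(d_raw, 2 / 3)
```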

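The short answer is that this is the Jaccard distance. If we
cross-tabulate two vectors x (rows) and y (columns) in a 2x2
presence/absence table and label the cells a, b, c, d:

             y Present  y Absent
x Present        a         b
x Absent         c         d

then a counts the bits where both are on, b and c count the
bits where only one is on, and d counts the bits where both
are off. The Jaccard similarity index is a/(a+b+c), and the
Jaccard distance is 1 - a/(a+b+c), which equals
(b+c)/(a+b+c). That matches the ?dist wording: b+c is the
number of bits in which only one is on, and a+b+c is the
number in which at least one is on. The joint absences (d) do
not enter either formula, which is consistent with the
"asymmetric binary" label.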
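
The formula can be checked against dist() directly. This is
only a sketch: the vectors x and y are made-up presence/absence
data, and the names n_a, n_b, n_c, n_d are introduced here for
the cells a, b, c, d of the table (avoiding a variable called
c, which would shadow R's c() function name).

```r
# Hypothetical presence/absence vectors (illustrative data only)
x <- c(1, 1, 0, 0, 1, 0)
y <- c(1, 0, 1, 0, 1, 0)

# Cells of the 2x2 table
n_a <- sum(x != 0 & y != 0)  # on in both
n_b <- sum(x != 0 & y == 0)  # on in x only
n_c <- sum(x == 0 & y != 0)  # on in y only
n_d <- sum(x == 0 & y == 0)  # off in both (not used below)

# Jaccard distance computed from the cell counts
jaccard_dist <- (n_b + n_c) / (n_a + n_b + n_c)

# The same pair of vectors through stats::dist
stats_dist <- as.numeric(dist(rbind(x, y), method = "binary"))

# Here n_a = 2, n_b = 1, n_c = 1, so both values should be 0.5
all.equal(jaccard_dist, stats_dist)
```

Changing the number of positions where both vectors are zero
leaves the result unchanged, since n_d never appears in the
formula.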

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352




-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of ??
Sent: Thursday, July 18, 2013 9:56 AM
To: r-help at r-project.org
Subject: [R] binary distance measure of the "dist" function in
the "stats" package

Dear all:
    I want to ask a question about the "binary" distance
measure. As far as I know, there are many binary distance
measures, e.g., binary Jaccard distance, binary Euclidean
distance, and binary Bray-Curtis distance. It is even more
confusing because many have more than one name. So I want to
know the definitive name of the binary distance measure of the
"dist" function in the "stats" package, and I would also like
to know the equation of the binary distance. Thank you very
much!
With my best regards.




