[R] similarity measure

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Mar 29 16:14:12 CEST 2010


On Fri, Mar 26, 2010 at 7:12 AM, karuna m <m_karuna2002 at yahoo.com> wrote:
> hi all,
> I am doing hierarchical clustering of binary data using the similarity
> measures in package ade4 together with the hclust function. For
> method = 8 and method = 9 of dist.binary, I am getting NA values, so
> hclust fails with:
>   Error in hclust(d8, method = "ward") :
>     NA/NaN/Inf in foreign function call (arg 11)
> I think this is due to a zero in the denominator of the similarity
> measure formula, sqrt((a+b)(a+c)(b+d)(d+c)). Could someone please help
> me fix the error?

In your mind, what would the "correct" distance be between two of your
observations if a zero does end up in the denominator? Meaning: I guess
R doesn't know what to do with a division by zero, so it hands back
NA/NaN, and if you have some intuition about what the value should be,
perhaps you can tell it.
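
To make that concrete, here is a rough sketch in base R, not the ade4
code itself. It assumes that method 9 is Pearson's phi with the
denominator you quote, and that the distance is sqrt(1 - s). The helper
phi_similarity, the toy matrix m, the choice to set an undefined
similarity to 0, and the "average" linkage are all mine, purely for
illustration.

```r
# Hypothetical helper: Pearson's phi from the 2x2 counts
# a (1/1), b (1/0), c (0/1), d (0/0) of two binary vectors.
phi_similarity <- function(x, y) {
  a <- sum(x == 1 & y == 1)
  b <- sum(x == 1 & y == 0)
  c <- sum(x == 0 & y == 1)
  d <- sum(x == 0 & y == 0)
  (a * d - b * c) / sqrt((a + b) * (a + c) * (b + d) * (d + c))
}

# Toy data: rows are observations. Row 3 is all zeros, so one of the
# marginal sums in the denominator is 0 whenever it is involved.
m <- rbind(c(1, 0, 1, 1, 0),
           c(1, 1, 0, 1, 0),
           c(0, 0, 0, 0, 0),
           c(0, 1, 1, 0, 1))

n <- nrow(m)
s <- matrix(1, n, n)            # similarity of a row with itself is 1
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) s[i, j] <- phi_similarity(m[i, ], m[j, ])
  }
}
n_undefined <- sum(is.nan(s))   # entries that came out as 0/0

# Assumption: an undefined similarity is treated as "no association".
s[is.nan(s)] <- 0

d  <- as.dist(sqrt(1 - s))      # assumed distance form: sqrt(1 - s)
hc <- hclust(d, method = "average")
```

Whether 0 is the right stand-in is exactly the judgment call above; any
other value you can defend would slot in at the same place.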

For instance, how bad would it be if you just added 1 (or some smaller
number) to all the points in your data so that you can't get zeroes in
the denominator?
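
One way to read that suggestion is to add a small constant to each of
the four counts a, b, c, d rather than to the raw 0/1 data. The sketch
below does that for Pearson's phi. The function name phi_smoothed and
the default eps = 0.5 are hypothetical choices of mine, and the result
will depend on the eps you pick.

```r
# Hypothetical helper: phi with a pseudocount eps added to each of the
# four cells, so that none of the marginal sums can be zero.
phi_smoothed <- function(x, y, eps = 0.5) {
  a <- sum(x == 1 & y == 1) + eps
  b <- sum(x == 1 & y == 0) + eps
  c <- sum(x == 0 & y == 1) + eps
  d <- sum(x == 0 & y == 0) + eps
  (a * d - b * c) / sqrt((a + b) * (a + c) * (b + d) * (d + c))
}

x <- c(1, 0, 1, 1, 0)
z <- c(0, 0, 0, 0, 0)   # all zeros: the unsmoothed formula gives 0/0

s_xz <- phi_smoothed(x, z)   # finite, because every cell is now > 0
```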

Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

More information about the R-help mailing list