[R] Enquiry about Hierarchical Clustering

Martin Maechler maechler at stat.math.ethz.ch
Sat Sep 27 19:11:14 CEST 2003


>>>>> "Adaikalavan" == Adaikalavan RAMASAMY <ramasamya at gis.a-star.edu.sg>
>>>>>     on Sat, 27 Sep 2003 17:05:43 +0800 writes:

    Adaikalavan> hclust() is unable to handle missing values in
    Adaikalavan> the result of dist().  There will be missing
    Adaikalavan> values in the dist() result if 1. all elements
    Adaikalavan> in a row are missing, or 2. for some pair of
    Adaikalavan> rows, every column has a missing value in at
    Adaikalavan> least one of the two rows.

As Kjetil Halvorsen said, use daisy() from the cluster
package instead of dist().
The daisy() function has two advantages over dist():
1. Handling of missing values
2. Handling of data with continuous *and* categorical variables.

[Btw, this really has nothing to do with the clustering
 method used *after* the distances have been computed.
 You can use hclust() on a daisy() result if you want.]
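
A minimal sketch of that combination, assuming a small made-up data
frame (the names x, height, weight and group are purely illustrative
and not from the original question). Every pair of rows here shares at
least one non-missing variable, so the resulting dissimilarities
should contain no NA:

```r
library(cluster)  # provides daisy()

# Hypothetical toy data: two numeric variables and one factor,
# each with a missing value in a different row.
x <- data.frame(
  height = c(1.2, NA, 3.1, 2.8, 0.9),
  weight = c(10, 12, NA, 31, 9),
  group  = factor(c("a", "b", "b", NA, "a"))
)

# Gower's coefficient copes with mixed variable types; for each pair
# of rows, only the variables observed in both rows are used.
d <- daisy(x, metric = "gower")

# The daisy() result can be passed on to hclust(); the linkage
# method chosen here is arbitrary.
hc <- hclust(d, method = "average")
```

If some pair of rows had no commonly observed variable at all, I would
still expect an NA for that pair, so it may be worth checking anyNA(d)
before clustering.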

Regards,
Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><



