[R] non-uniqueness in cluster analysis
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Dec 3 15:53:22 CET 2003
On Wed, 3 Dec 2003, Bruno Giordano wrote:
> Hi,
> I'm clustering objects defined by categorical variables with a hierarchical
> algorithm - average linkage.
> My distance matrix (general dissimilarity coefficient) includes several
> distances with exactly the same values.
> As far as I can see, a standard agglomerative procedure ignores this problem,
> simply selecting, among equal distances, the one that comes first.
> For this reason the resulting analysis depends strongly on the ordering of
> the objects within the raw data matrix.
> Is there a standard procedure to deal with this?
Don't use average linkage!
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
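
The order dependence described in the question can be reproduced on three
objects. What follows is a minimal Python sketch and not code from this
thread: the function name `cluster`, the toy dissimilarities, and the
"first pair scanned wins" tie rule are all illustrative assumptions about how
a naive agglomerative procedure behaves. Single linkage appears only as a
contrast; the reply above does not name an alternative method.

```python
from itertools import combinations


def cluster(labels, dist, linkage="average"):
    """Naive agglomerative clustering; ties go to the first pair scanned.

    labels: object names, in the order they appear in the data.
    dist:   dict mapping frozenset({a, b}) -> dissimilarity.
    Returns the cophenetic distances (merge height at which each pair of
    objects is first joined), keyed by frozenset({a, b}).
    """
    clusters = [(name,) for name in labels]
    cophenetic = {}

    def between(c1, c2):
        values = [dist[frozenset((a, b))] for a in c1 for b in c2]
        if linkage == "average":
            return sum(values) / len(values)  # mean of all cross pairs
        return min(values)  # single linkage

    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = between(clusters[i], clusters[j])
            if best is None or d < best[0]:  # strict '<': first tie wins
                best = (d, i, j)
        height, i, j = best
        for a in clusters[i]:
            for b in clusters[j]:
                cophenetic[frozenset((a, b))] = height
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return cophenetic


# Hypothetical toy data: A-B and B-C are tied at 1.0.
dist = {
    frozenset(("A", "B")): 1.0,
    frozenset(("B", "C")): 1.0,
    frozenset(("A", "C")): 2.0,
}

# Same objects, two different row orders.
avg1 = cluster(["A", "B", "C"], dist, "average")
avg2 = cluster(["C", "B", "A"], dist, "average")
single1 = cluster(["A", "B", "C"], dist, "single")
single2 = cluster(["C", "B", "A"], dist, "single")

print("average linkage depends on order:", avg1 != avg2)
print("single linkage depends on order: ", single1 != single2)
```

In this example average linkage joins A and B at height 1.0 under the first
ordering but only at 1.5 under the second, while single linkage gives the same
heights either way. The sketch covers this toy case only and is not a general
statement about every linkage method.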