[R] non-uniqueness in cluster analysis

Wed Dec 3 15:53:22 CET 2003

On Wed, 3 Dec 2003, Bruno Giordano wrote:

> Hi,
> I'm clustering objects defined by categorical variables with a hierarchical
> algorithm - average linkage.
> My distance matrix (general dissimilarity coefficient) includes several
> distances with exactly the same values.
> As far as I can see, a standard agglomerative procedure ignores this problem,
> simply selecting, among equal distances, the one that comes first.
> For this reason the output of the analysis depends strongly on the ordering of
> the objects within the raw data matrix.
> Is there a standard procedure to deal with this?
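
The order dependence described in the question can be reproduced with a small sketch. The code below is an illustration in Python, not something posted in the thread: the function `average_linkage` and the three-object dissimilarity table are hypothetical, and the tie-breaking rule (keep the first minimal pair scanned) is an assumption about how a naive agglomerative implementation behaves.

```python
from itertools import combinations


def average_linkage(labels, dist):
    """Naive average-linkage clustering.

    labels: objects in the order they appear in the data.
    dist:   dict mapping frozenset({x, y}) to the dissimilarity of x and y.
    Ties are broken by keeping the first minimal pair scanned (assumption).
    Returns a list of merges as (cluster_a, cluster_b, height).
    """
    clusters = [frozenset([x]) for x in labels]
    merges = []

    def avg(a, b):
        # Mean of all pairwise dissimilarities between clusters a and b.
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            h = avg(clusters[i], clusters[j])
            # Strict "<" means a later pair at the same distance never wins.
            if best is None or h < best[0]:
                best = (h, i, j)
        h, i, j = best
        merges.append((set(clusters[i]), set(clusters[j]), h))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical dissimilarities with a tie: d(A,B) = d(B,C) = 1, d(A,C) = 2.
dist = {
    frozenset("AB"): 1.0,
    frozenset("BC"): 1.0,
    frozenset("AC"): 2.0,
}

merges_abc = average_linkage(["A", "B", "C"], dist)
merges_cba = average_linkage(["C", "B", "A"], dist)

print(merges_abc[0])  # first merge when the objects are ordered A, B, C
print(merges_cba[0])  # first merge when the objects are ordered C, B, A
```

With the objects ordered A, B, C the first merge joins A and B; with the order reversed it joins C and B. The merge heights are the same in both runs, but cutting either tree into two clusters gives a different partition, which is the non-uniqueness the question is about.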

Don't use average linkage!

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595