[R] question about centroid-linkage (cluster analysis)
james.foadi at diamond.ac.uk
Thu Dec 10 14:26:04 CET 2009
Dear R community,
I would be grateful if somebody could shed light on the following.
I have created a set of 6 points to check how centroid
agglomeration works in cluster analysis:
> Y <- data.frame(x=c(-1,1,1,-1,10,12),y=c(1,1,-1,-1,0,0))
It is quite intuitive to understand that the last clusters to be joined will be
{1,2,3,4} with {5,6}. Now, the centroid for the first cluster has coordinates (0,0),
while the centroid for the second cluster has coordinates (11,0). Therefore, the
distance between these two clusters should be 11. But:
> Y.dist <- dist(Y)
> Y.hc_c <- hclust(Y.dist,method="centroid")
> Y.hc_c$merge
     [,1] [,2]
[1,]   -1   -2
[2,]   -3    1
[3,]   -4    2
[4,]   -5   -6
[5,]    3    4
> Y.hc_c$height
[1] 2.000000 1.914214 1.517428 2.000000 9.692575
So, from this it would appear that the distance between the last two clusters is 9.692575!
How can it be?
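For reference, the value of 11 that I expected can be checked directly from the coordinates. This is only a minimal sketch of the arithmetic, not of what hclust does internally; the names cen_a, cen_b and cen_dist are my own, introduced purely for illustration.

```r
# Same six points as in Y above
Y <- data.frame(x = c(-1, 1, 1, -1, 10, 12), y = c(1, 1, -1, -1, 0, 0))

# Centroid of each of the two expected final clusters
# (column means over the rows belonging to the cluster)
cen_a <- colMeans(Y[1:4, ])   # cluster {1,2,3,4}: x = 0,  y = 0
cen_b <- colMeans(Y[5:6, ])   # cluster {5,6}:     x = 11, y = 0

# Plain Euclidean distance between the two centroids
cen_dist <- sqrt(sum((cen_a - cen_b)^2))
print(cen_dist)               # 11
```

So the geometric centroid distance is indeed 11, which is what I would have expected as the last entry of Y.hc_c$height.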
J
Dr James Foadi PhD
Membrane Protein Laboratory (MPL)
Diamond Light Source Ltd
Diamond House
Harewell Science and Innovation Campus
Chilton, Didcot
Oxfordshire OX11 0DE
Email : james.foadi at diamond.ac.uk
Alt Email: j.foadi at imperial.ac.uk