[R] distance in the function kmeans

Uwe Ligges ligges at statistik.uni-dortmund.de
Fri May 28 11:11:04 CEST 2004


n.bouget at laposte.net wrote:

>>n.bouget wrote:
>>
>>
>>>Hi,
>>>I want to know which distance is used in the function kmeans
>>>and whether we can change this distance.
>>>Indeed, in the function pam, we can pass a distance matrix as
>>>the argument (with the line "pam<-pam(dist(matrixdata),k=7)"), but
>>>we can't do that in the function kmeans; we have to pass the
>>>data matrix directly ...
> 
> Yes but how can we choose the distance to calculate centers?

Ah, you want to use a different distance measure (e.g. Euclidean,
Manhattan, ...) as in other clustering methods? Well, that's not possible
with the kmeans() implementation. See ?kmeans, which tells you:


   The data given by x is clustered by the k-means algorithm. When this
   terminates, all cluster centres are at the mean of their Voronoi sets
   (the set of data points which are nearest to the cluster centre).

   The algorithm of Hartigan and Wong (1979) is used.
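
As a minimal sketch of what that help text means in practice, the
following checks that the centres returned by kmeans() coincide with the
plain column means of the points assigned to each cluster. The data, the
seed and the names `x`, `fit` and `manual` are made up for illustration.

```r
# Sketch only: simulated two-group data, not from the original question
set.seed(1)                                          # arbitrary seed
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),   # 50 points around (0, 0)
           matrix(rnorm(100, mean = 4), ncol = 2))   # 50 points around (4, 4)

fit <- kmeans(x, centers = 2)

# Recompute each cluster centre by hand as the mean of its members
manual <- t(sapply(1:2, function(k)
  colMeans(x[fit$cluster == k, , drop = FALSE])))

# Compare with the centres reported by kmeans() (dimnames dropped)
all.equal(unname(fit$centers), manual)
```

Since the centres are arithmetic means, the criterion being minimised is
the within-cluster sum of squares, i.e. squared Euclidean distance.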


Of course, you can do some projection based on the calculated
distances, but I don't think there are functions available to do that
completely automatically, and interpretation of the results won't be
that easy ...
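
One possible way to do such a projection by hand, given only as a sketch:
compute the distance matrix with the measure you want, embed it in
Euclidean space with classical multidimensional scaling (cmdscale()), and
run kmeans() on the resulting coordinates. The choice of cmdscale(), of
two dimensions, and of the simulated data are assumptions for
illustration; the embedding only approximates the original distances.

```r
# Sketch only: simulated data and an assumed choice of projection
set.seed(1)                                          # arbitrary seed
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

# Distance matrix with a non-Euclidean measure
d <- dist(x, method = "manhattan")

# Classical MDS: coordinates whose Euclidean distances approximate d
proj <- cmdscale(d, k = 2)

# k-means on the projected coordinates, not on the original data
fit <- kmeans(proj, centers = 2)
table(fit$cluster)
```

Note that the cluster centres then live in the projected space, not in
the space of the original variables, which is why interpretation gets
harder.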

Uwe Ligges



> 
>>>Thanks in advance,
>>>Nicolas BOUGET
>>
>>As the name says, kmeans() calculates *means* (centres) of clusters.
>>It does not make any sense to do that on distances ...
>>
>>Uwe Ligges
>>
>>
>>



