[R] Kmeans

Ranjan Maitra maitra at iastate.edu
Wed Mar 28 18:07:05 CEST 2007


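The answer is correct, and it is the way it should be. Cluster indicators are only nominal: the ids carry no ordering, so the same clusters can come back with their labels permuted, and the means then appear in a different order. In your two runs the partition is identical; only labels 1 and 3 are swapped.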
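
To make the point concrete, here is a minimal sketch (not part of the original reply) of one way to give the clusters a canonical labelling after the fit. It rebuilds the data from the question and renumbers the clusters by increasing mean PRODUCTS. The ordering variable and the names ord, centers_sorted and cluster_sorted are choices made for this illustration only.

```r
# Data as posted in the question
y <- data.frame(YEAR = 1:9,
                PRODUCTS = c(10, 42, 25, 42, 40, 45, 44, 47, 42))

fit <- kmeans(y[, c("YEAR", "PRODUCTS")], centers = 3)

# ord[k] is the original id of the cluster with the k-th smallest
# mean PRODUCTS (ordering by PRODUCTS is an arbitrary assumption)
ord <- order(fit$centers[, "PRODUCTS"])

# Reorder the rows of the centers matrix and renumber them 1..3
centers_sorted <- fit$centers[ord, , drop = FALSE]
rownames(centers_sorted) <- seq_along(ord)

# Map each observation's original id to its position in ord
cluster_sorted <- match(fit$cluster, ord)

print(centers_sorted)
print(cluster_sorted)
```

This only removes the label permutation. If a different random start yields a genuinely different partition, the relabelled results will still differ.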

Either way, this happens (and you could also get totally different answers) because of how R initializes kmeans: it uses random starts, which is clearly explained in the help page for kmeans.
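
A hedged sketch of what follows from this (not part of the original reply): fixing the random number seed with set.seed() before calling kmeans() makes the random starts, and hence the result, repeatable. The seed value 42 and nstart = 25 are arbitrary choices for this illustration. nstart asks kmeans() to try several random starts and keep the best one, which tends to make the solution less dependent on any single start.

```r
# Data as posted in the question
y <- data.frame(YEAR = 1:9,
                PRODUCTS = c(10, 42, 25, 42, 40, 45, 44, 47, 42))

# Same seed before each call, so both calls draw the same random starts
set.seed(42)  # arbitrary seed, assumed for this sketch
fit1 <- kmeans(y[, c("YEAR", "PRODUCTS")], centers = 3, nstart = 25)

set.seed(42)
fit2 <- kmeans(y[, c("YEAR", "PRODUCTS")], centers = 3, nstart = 25)

# Both checks should report that the two fits agree
print(identical(fit1$cluster, fit2$cluster))
print(all.equal(fit1$centers, fit2$centers))
```

The seed has to be set immediately before each kmeans() call; setting it once at the top of a session does not make later calls repeat one another.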

Ranjan

On Wed, 28 Mar 2007 17:48:18 +0200 "Sergio Della Franca" <sergio.della.franca at gmail.com> wrote:

> Dear R-Helpers,
> 
> I performed kmeans clustering on the following data set (y):
> 
>  YEAR  PRODUCTS
>     1          10
>     2          42
>     3          25
>     4          42
>     5          40
>     6          45
>     7          44
>     8          47
>     9          42
> 
> 
> with this code:
> 
> cluster <- kmeans(y[, c("YEAR", "PRODUCTS")], 3)
> 
> Every time I run this code, the components of cluster ("means", "vector")
> change value, i.e.
> 
> First run:
> 
> Cluster means
> 
>       YEAR  PRODUCTS
> 1 7.500000 44.50000
> 2 3.666667 41.33333
> 3 2.000000 17.50000
> 
> Clustering vector:
> 1 2 3 4 5 6 7 8 9
> 3 2 3 2 2 1 1 1 1
> 
> Second run:
> Cluster means
>       YEAR  PRODUCTS
> 1 2.000000 17.50000
> 2 3.666667 41.33333
> 3 7.500000 44.50000
> 
> Clustering vector:
> 1 2 3 4 5 6 7 8 9
> 1 2 1 2 2 3 3 3 3
> 
> 
> How can I modify the code, if it is possible, to obtain the same values
> ("means", "vector") every time I run it?
> 
> Thank you in advance.
> 
> Sergio Della Franca.
> 


