[R] kmeans clustering
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Apr 14 13:04:20 CEST 2003
On Mon, 14 Apr 2003, pingzhao wrote:
> Hi,
>
> I am using kmeans to cluster a dataset.
> I test this example:
>
> > data<-matrix(scan("data100.txt"),100,37,byrow=T)
> (my dataset is 100 rows and 37 columns--clustering rows)
>
> > c1<-kmeans(data,3,20)
> > c1
> $cluster
> [1] 1 1 1 1 1 1 1 3 3 3 1 3 1 3 3 1 1 1 1 3 1 3 3 1 1 1 3 3 1 1 3 1 1 1 1 3 3
> [38] 3 1 1 1 3 1 1 1 1 3 3 3 1 1 1 1 1 1 3 1 3 1 1 3 1 1 1 1 3 1 1 1 1 1 1 3 1
> [75] 1 3 1 3 1 1 1 1 3 1 1 1 1 1 3 1 1 3 1 1 3 3 1 2 1 1
>
> $withinss
> [1] 1037.5987 0.0000 666.9701
>
> $size
> [1] 68 1 31
>
> > c4<-kmeans(data,3,20)
> $withinss
> [1] 0.0000 865.7628 851.1214
>
> $size
> [1] 1 54 45
>
> Can anyone tell me why the results are so different, with the same
> dataset and parameters, when I run the command
> 'kmeans(data,3,20)' several times?
The help page could tell you:

  centers: Either the number of clusters or a set of initial cluster
           centers. If the first, a random set of rows in `x' are chosen
           as the initial centers.
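
A minimal sketch of what that help text implies. The original data100.txt is not available, so the matrix `x` below is simulated stand-in data (an assumption), as are the seed values and the choice of rows 1 to 3 as starting centers. Fixing the random seed, or passing explicit starting centers, should make repeated calls agree.

```r
# Simulated stand-in for the poster's 100 x 37 matrix (an assumption,
# not the original data100.txt).
set.seed(1)
x <- matrix(rnorm(100 * 37), nrow = 100, ncol = 37)

# With centers = 3, kmeans() draws 3 random rows of x as starting centers,
# so resetting the seed before each call repeats the same draw.
set.seed(42)
fit_a <- kmeans(x, centers = 3, iter.max = 20)
set.seed(42)
fit_b <- kmeans(x, centers = 3, iter.max = 20)
same_seed_same_result <- identical(fit_a$cluster, fit_b$cluster)

# Alternatively, pass the starting centers explicitly (here rows 1 to 3,
# chosen arbitrarily for illustration); no random draw is involved.
start <- x[1:3, ]
fit_c <- kmeans(x, centers = start, iter.max = 20)
fit_d <- kmeans(x, centers = start, iter.max = 20)
explicit_start_same_result <- identical(fit_c$cluster, fit_d$cluster)

print(c(same_seed = same_seed_same_result,
        explicit_start = explicit_start_same_result))
```

Repeatability obtained this way says nothing about whether the solution found is a good one; it only removes the run-to-run variation.
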
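At the very least, the labellings of the clusters are arbitrary, so the
same partition can come back with its cluster numbers permuted. Beyond
that, K-means usually has multiple local minima, so different random
starts can converge to genuinely different partitions.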
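
Both points can be checked with a short sketch. It uses simulated data in place of data100.txt (an assumption), and it assumes a version of R in which kmeans() accepts the `nstart` argument; the seeds and the value 25 are arbitrary illustrative choices.

```r
# Simulated stand-in for the poster's 100 x 37 matrix (an assumption).
set.seed(1)
x <- matrix(rnorm(100 * 37), nrow = 100, ncol = 37)

# Two runs from different random starting centers.
set.seed(10)
run1 <- kmeans(x, centers = 3, iter.max = 20)
set.seed(20)
run2 <- kmeans(x, centers = 3, iter.max = 20)

# Cluster numbers are arbitrary, so compare the partitions with a
# cross-tabulation instead of comparing the label vectors directly.
# Identical partitions give exactly one non-zero cell per row and column.
agreement <- table(run1 = run1$cluster, run2 = run2$cluster)
print(agreement)

# Different local minima show up as different total within-cluster
# sums of squares.
print(c(run1 = sum(run1$withinss), run2 = sum(run2$withinss)))

# Try 25 random starts and keep the one with the smallest total
# within-cluster sum of squares (nstart is assumed to be available).
set.seed(30)
best <- kmeans(x, centers = 3, iter.max = 20, nstart = 25)
print(sum(best$withinss))
```

Using several starts reduces the chance of reporting a poor local minimum, but it does not guarantee the global one.
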
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595