[R] K-means results understanding!!!

David Carlson dcarlson at tamu.edu
Mon Jun 24 17:00:40 CEST 2013


You should read the help page

?kmeans

especially the section labeled "Value", which describes what kmeans()
returns. Cluster membership is returned as a vector of integers called
"cluster", with one entry per row of the input data. If you don't know
how to extract that component from kmeans.results, work through one of
the basic tutorials on R first.
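
As a minimal sketch of what the "Value" section implies: the poster's
mydata was not shared, so the random matrix below is only a stand-in,
and the names membership and rows.by.cluster are made up for
illustration. nstart is reduced to 10 simply to keep the example fast.

```r
# Illustrative data only: a 60 x 4 random matrix stands in for the
# original poster's `mydata`, which is not available.
set.seed(42)
mydata <- matrix(rnorm(60 * 4), nrow = 60)

kmeans.results <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 10)

# Cluster membership: one integer (1 to 5) per row of mydata
membership <- kmeans.results$cluster

# Number of rows in each cluster (same counts as kmeans.results$size)
table(membership)

# Rows of the raw data that were assigned to cluster 2
mydata[membership == 2, , drop = FALSE]

# Row numbers of the raw data, grouped by cluster (hypothetical helper name)
rows.by.cluster <- split(seq_len(nrow(mydata)), membership)
```

Any component listed under "Available components" in the printed output
can be extracted the same way with $, for example kmeans.results$centers.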

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Dzu
Sent: Monday, June 24, 2013 4:25 AM
To: r-help at r-project.org
Subject: [R] K-means results understanding!!!

Dear members,

I am having trouble understanding the kmeans results in R. I am applying
the k-means algorithm to my large data file, and it produces the
clusters.

Q1) Does anybody know how to find out which data have been assigned to
which cluster? (I have fixed the number of clusters at 5.)
COMMAND
(kmeans.results <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 10000))

Q2) When I call kmeans.results I get the following output:


K-means clustering with 5 clusters of sizes 17, 1, 6, 4, 32

Cluster means:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]     [,11]        [,12]
1    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0008235294
2    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0000000000
3    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0000000000
4    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0040000000
5    0    0    0    0    0    0    0    0    0     0 0.0003125 0.0003750000
         [,13]       [,14]       [,15]       [,16]       [,17]      [,18]
1 0.0008235294 0.001176471 0.005176471 0.012471295 0.041181652 0.10663935
2 0.0000000000 0.000000000 0.000000000 0.000000000 0.169491525 0.61016949
3 0.0000000000 0.000000000 0.000000000 0.002333333 0.006666667 0.07695015
4 0.0030000000 0.001500000 0.001000000 0.017500000 0.029000000 0.06150000
5 0.0015625000 0.003437500 0.010687500 0.046375000 0.100062500 0.14306250
       [,19]     [,20]     [,21]     [,22]      [,23]      [,24]       [,25]
1 0.12946535 1.0017347 0.3360283 0.2455259 0.08565672 0.02553212 0.006000000
2 0.94915254 0.1694915 0.1016949 0.0000000 0.00000000 0.00000000 0.000000000
3 0.09376439 1.3857837 0.2659812 0.1015707 0.03804953 0.02023362 0.007666667
4 0.17100000 0.6665000 0.7860000 0.1860000 0.04650000 0.01450000 0.012000000
5 0.18100000 0.5200625 0.4156875 0.3461250 0.16925000 0.04918750 0.011500000
         [,26]       [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35]
1 0.0005882353 0.001176471     0     0     0     0     0     0     0     0
2 0.0000000000 0.000000000     0     0     0     0     0     0     0     0
3 0.0010000000 0.000000000     0     0     0     0     0     0     0     0
4 0.0000000000 0.000000000     0     0     0     0     0     0     0     0
5 0.0013125000 0.000000000     0     0     0     0     0     0     0     0
  [,36] [,37] [,38] [,39] [,40]
1     0     0     0     0     0
2     0     0     0     0     0
3     0     0     0     0     0
4     0     0     0     0     0
5     0     0     0     0     0

Clustering vector:
 [1] 1 5 5 3 1 5 5 5 5 1 4 1 5 5 5 5 4 5 2 3 5 5 1 5 5 5 5 1 3 1 4 5 5 1 5 5 5 1
[39] 3 1 5 5 3 1 1 1 1 5 5 1 4 1 3 5 5 5 5 5 5 1

Within cluster sum of squares by cluster:
[1] 0.6702803 0.0000000 0.2453294 0.1860180 1.3535263
 (between_SS / total_SS =  76.8 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"

Q3) I would like to understand which raw data are in which cluster. Does
anybody know how to access the rows of raw data that belong to the same
cluster?

Thanks for your help
DZU



