[R] Understanding k-means results

Mon Jun 24 11:25:21 CEST 2013

Dear members,

I am having trouble understanding the output of kmeans() in R. I am applying
the k-means algorithm to my large data file, and it produces the clusters
shown below.

Q1) Does anybody know how to find out which data points have been assigned
to which cluster? (I have fixed the number of clusters at 5.)
The command I used:
(kmeans.results <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 10000))
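To make the question reproducible, here is a minimal sketch of the same call on made-up data. The random matrix is only a hypothetical stand-in for mydata (60 rows and 40 columns, matching the output below), and nstart is reduced to 25 purely to keep the run short. It shows where the fitted object appears to keep the per-row cluster labels.

```r
# Hypothetical stand-in for `mydata`: 60 rows, 40 numeric columns
set.seed(1)
mydata <- matrix(runif(60 * 40), nrow = 60, ncol = 40)

# Same call as in the post, with a smaller nstart (assumption, for speed)
kmeans.results <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 25)

# One cluster label (1 to 5) per row of mydata, in row order
kmeans.results$cluster

# Row indices of the observations assigned to cluster 1
which(kmeans.results$cluster == 1)

# Number of rows per cluster; should agree with kmeans.results$size
table(kmeans.results$cluster)
```

The i-th entry of kmeans.results$cluster is the cluster of the i-th row of mydata, so the "Clustering vector" in the printed output can be read the same way.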

Q2) When I print kmeans.results, I get the following output:

K-means clustering with 5 clusters of sizes 17, 1, 6, 4, 32

Cluster means:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]     [,11]        [,12]
1    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0008235294
2    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0000000000
3    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0000000000
4    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0040000000
5    0    0    0    0    0    0    0    0    0     0 0.0003125 0.0003750000
         [,13]       [,14]       [,15]       [,16]       [,17]      [,18]
1 0.0008235294 0.001176471 0.005176471 0.012471295 0.041181652 0.10663935
2 0.0000000000 0.000000000 0.000000000 0.000000000 0.169491525 0.61016949
3 0.0000000000 0.000000000 0.000000000 0.002333333 0.006666667 0.07695015
4 0.0030000000 0.001500000 0.001000000 0.017500000 0.029000000 0.06150000
5 0.0015625000 0.003437500 0.010687500 0.046375000 0.100062500 0.14306250
       [,19]     [,20]     [,21]     [,22]      [,23]      [,24]       [,25]
1 0.12946535 1.0017347 0.3360283 0.2455259 0.08565672 0.02553212 0.006000000
2 0.94915254 0.1694915 0.1016949 0.0000000 0.00000000 0.00000000 0.000000000
3 0.09376439 1.3857837 0.2659812 0.1015707 0.03804953 0.02023362 0.007666667
4 0.17100000 0.6665000 0.7860000 0.1860000 0.04650000 0.01450000 0.012000000
5 0.18100000 0.5200625 0.4156875 0.3461250 0.16925000 0.04918750 0.011500000
         [,26]       [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35]
1 0.0005882353 0.001176471     0     0     0     0     0     0     0     0
2 0.0000000000 0.000000000     0     0     0     0     0     0     0     0
3 0.0010000000 0.000000000     0     0     0     0     0     0     0     0
4 0.0000000000 0.000000000     0     0     0     0     0     0     0     0
5 0.0013125000 0.000000000     0     0     0     0     0     0     0     0
  [,36] [,37] [,38] [,39] [,40]
1     0     0     0     0     0
2     0     0     0     0     0
3     0     0     0     0     0
4     0     0     0     0     0
5     0     0     0     0     0

Clustering vector:
 [1] 1 5 5 3 1 5 5 5 5 1 4 1 5 5 5 5 4 5 2 3 5 5 1 5 5 5 5 1 3 1 4 5 5 1 5 5 5 1
[39] 3 1 5 5 3 1 1 1 1 5 5 1 4 1 3 5 5 5 5 5 5 1

Within cluster sum of squares by cluster:
[1] 0.6702803 0.0000000 0.2453294 0.1860180 1.3535263
 (between_SS / total_SS =  76.8 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"        
> 
Q3) I would like to understand which raw data are in which cluster. Does
anybody know how to access the table of raw data that fall in the same
cluster?
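To illustrate what I am after in Q3, here is a rough sketch that uses the cluster labels to pull out the raw rows per cluster. It is self-contained and runs on made-up data: the random matrix is a hypothetical stand-in for mydata, and the names km, cluster3, by_cluster and labelled are chosen only for this example.

```r
# Hypothetical stand-in for `mydata`: 60 rows, 40 numeric columns
set.seed(1)
mydata <- matrix(runif(60 * 40), nrow = 60)
km <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 25)

# Raw rows assigned to cluster 3 (drop = FALSE keeps a matrix even for one row)
cluster3 <- mydata[km$cluster == 3, , drop = FALSE]

# List of five matrices, one per cluster, holding the raw rows of each
by_cluster <- lapply(split(seq_len(nrow(mydata)), km$cluster),
                     function(idx) mydata[idx, , drop = FALSE])

# Raw data with the cluster label appended as an extra column
labelled <- data.frame(mydata, cluster = km$cluster)
```

Here by_cluster[["3"]] holds the same rows as cluster3, and labelled can be sorted or filtered on its cluster column.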

Thanks for your help,
DZU

--
View this message in context: http://r.789695.n4.nabble.com/K-means-results-understanding-tp4670171.html
Sent from the R help mailing list archive at Nabble.com.