[R] K-means results understanding!!!
David Carlson
dcarlson at tamu.edu
Mon Jun 24 17:00:40 CEST 2013
Read the help page
?kmeans
especially the section labeled "Value", which describes what kmeans
returns. Cluster membership is returned as an integer vector named
"cluster", with one entry per row of your data, in the same order as
the rows. If you don't know how to access that component of
kmeans.results, work through one of the basic R tutorials first.
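Something along these lines should do it. This is only a sketch: the
data below are made up (60 rows, 4 columns) to stand in for your
mydata, and I have used smaller iter.max and nstart values so it runs
quickly. The names membership, rows.by.cluster and labelled are just
ones I picked for illustration.

```r
# Hypothetical stand-in for the poster's data: 60 rows, 4 numeric columns.
set.seed(42)
mydata <- matrix(rnorm(60 * 4), nrow = 60)

# Same call shape as in the question, with smaller iter.max / nstart.
kmeans.results <- kmeans(mydata, centers = 5, iter.max = 100, nstart = 25)

# Cluster membership: one integer per row of mydata, in row order.
membership <- kmeans.results$cluster
head(membership)

# Number of rows in each cluster; agrees with kmeans.results$size.
table(membership)

# Raw rows assigned to cluster 2
# (drop = FALSE keeps a matrix even if the cluster has a single row).
mydata[membership == 2, , drop = FALSE]

# Row indices of every cluster at once, as a list with one element per cluster.
rows.by.cluster <- split(seq_len(nrow(mydata)), membership)

# Or keep the label next to the raw data as an extra column.
labelled <- data.frame(mydata, cluster = membership)
head(labelled)
```

With your own data you would skip the first two lines and use the
kmeans.results object you already have.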
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Dzu
Sent: Monday, June 24, 2013 4:25 AM
To: r-help at r-project.org
Subject: [R] K-means results understanding!!!
Dear members,
I am having trouble understanding the kmeans results in R. I am
applying the k-means algorithm to my big data file, and it produces
the clusters.
Q1) Does anybody know how to find out which data ended up in which
cluster (I have fixed the number of clusters at 5)?
COMMAND
(kmeans.results <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 10000))
Q2) When I print kmeans.results I get the following output:
K-means clustering with 5 clusters of sizes 17, 1, 6, 4, 32
Cluster means:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]     [,11]        [,12]
1    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0008235294
2    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0000000000
3    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0000000000
4    0    0    0    0    0    0    0    0    0     0 0.0000000 0.0040000000
5    0    0    0    0    0    0    0    0    0     0 0.0003125 0.0003750000
         [,13]       [,14]       [,15]       [,16]       [,17]      [,18]
1 0.0008235294 0.001176471 0.005176471 0.012471295 0.041181652 0.10663935
2 0.0000000000 0.000000000 0.000000000 0.000000000 0.169491525 0.61016949
3 0.0000000000 0.000000000 0.000000000 0.002333333 0.006666667 0.07695015
4 0.0030000000 0.001500000 0.001000000 0.017500000 0.029000000 0.06150000
5 0.0015625000 0.003437500 0.010687500 0.046375000 0.100062500 0.14306250
       [,19]     [,20]     [,21]     [,22]      [,23]      [,24]       [,25]
1 0.12946535 1.0017347 0.3360283 0.2455259 0.08565672 0.02553212 0.006000000
2 0.94915254 0.1694915 0.1016949 0.0000000 0.00000000 0.00000000 0.000000000
3 0.09376439 1.3857837 0.2659812 0.1015707 0.03804953 0.02023362 0.007666667
4 0.17100000 0.6665000 0.7860000 0.1860000 0.04650000 0.01450000 0.012000000
5 0.18100000 0.5200625 0.4156875 0.3461250 0.16925000 0.04918750 0.011500000
         [,26]       [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35]
1 0.0005882353 0.001176471     0     0     0     0     0     0     0     0
2 0.0000000000 0.000000000     0     0     0     0     0     0     0     0
3 0.0010000000 0.000000000     0     0     0     0     0     0     0     0
4 0.0000000000 0.000000000     0     0     0     0     0     0     0     0
5 0.0013125000 0.000000000     0     0     0     0     0     0     0     0
  [,36] [,37] [,38] [,39] [,40]
1     0     0     0     0     0
2     0     0     0     0     0
3     0     0     0     0     0
4     0     0     0     0     0
5     0     0     0     0     0
Clustering vector:
 [1] 1 5 5 3 1 5 5 5 5 1 4 1 5 5 5 5 4 5 2 3 5 5 1 5 5 5 5 1 3 1 4 5 5 1 5 5 5 1
[39] 3 1 5 5 3 1 1 1 1 5 5 1 4 1 3 5 5 5 5 5 5 1
Within cluster sum of squares by cluster:
[1] 0.6702803 0.0000000 0.2453294 0.1860180 1.3535263
(between_SS / total_SS = 76.8 %)
Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"
>
Q3) I would like to understand which raw data are in which cluster.
Does somebody know how to access the table of raw data that are in
the same cluster?
Thanks for the help,
DZU