[R] Kmeans cluster analysis
pisicandru at hotmail.com
Wed Apr 11 15:31:13 CEST 2007
As far as I know there is a package called clustTool which has a very nice
interface with the capability to do different cluster analyses. It also
produces a plot of each cluster and the mean for each cluster of each
variable, and I guess this is what you are after! But depending on which
parameters you are using for the cluster analysis, the package is extremely
slow if you have more than 5000 datapoints. Maybe you can take the function
apart to see where and what generates the plot and use that for your
I hope this helps,
Date: Tue, 10 Apr 2007 19:51:24 +0000 (GMT)
From: nathaniel Grey <nathaniel.grey at yahoo.co.uk>
Subject: [R] Kmeans cluster analysis
To: r-help at stat.math.ethz.ch
Message-ID: <352480.52445.qm at web23402.mail.ird.yahoo.com>
I have a data-set containing 22 variables. After appropriate
transformations etc. I ran a kmeans cluster analysis for 4 clusters; I ran
it 20 times to find the result with the lowest within-cluster sum of
squares.
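
For reference, the repeated runs described above can be done in a single call, since kmeans() in the stats package takes an nstart argument and keeps the best of the random starts. This is only a minimal sketch: the matrix x and the variable names V1..V22 are made-up stand-ins, because the actual data were not posted.

```r
# Stand-in data (an assumption): 200 rows of random numbers in place of
# the 22 transformed variables, which are not shown in the post.
set.seed(1)
x <- matrix(rnorm(200 * 22), ncol = 22,
            dimnames = list(NULL, paste0("V", 1:22)))

# nstart = 20 tries 20 random sets of starting centres and returns the
# fit with the smallest total within-cluster sum of squares.
km <- kmeans(x, centers = 4, nstart = 20, iter.max = 50)

km$tot.withinss   # total within-cluster sum of squares of the kept fit
km$size           # number of observations in each cluster
```
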
My question is how best do I go about finding out what the characteristics
are of each cluster?
Is one cluster dominated by a particular set of variables or by a particular
The only way I know is to look at the means for each variable for each
cluster, but as there are 22 variables this is time consuming.
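
The per-cluster means do not have to be worked out one variable at a time: the object returned by kmeans() carries them in its centers component, and aggregate() gives the same table from the raw data. A rough sketch, again on made-up stand-in data (x and the names V1..V22 are assumptions, not the poster's data):

```r
# Stand-in data (an assumption), as the real 22 variables were not posted.
set.seed(1)
x <- matrix(rnorm(200 * 22), ncol = 22,
            dimnames = list(NULL, paste0("V", 1:22)))
km <- kmeans(x, centers = 4, nstart = 20, iter.max = 50)

# One row per cluster, one column per variable: the cluster means.
round(km$centers, 2)

# The same table computed from the data and the cluster labels.
cl_means <- aggregate(as.data.frame(x),
                      by = list(cluster = km$cluster), FUN = mean)
cl_means
```
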
Is there a way to graphically represent the clusters in relation to the
variables? If so, I might need some guidance on the coding as I am new to
the R environment.
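
One possible way to picture this, offered only as a sketch, is a profile plot: one line per cluster, showing that cluster's mean on each variable. The example below uses base graphics (matplot) on made-up stand-in data and standardises the variables first with scale() so the 22 means share one axis; whether that suits the transformations already applied is an assumption.

```r
# Stand-in data (an assumption), standardised so all variables share a scale.
set.seed(1)
x <- scale(matrix(rnorm(200 * 22), ncol = 22,
                  dimnames = list(NULL, paste0("V", 1:22))))
km <- kmeans(x, centers = 4, nstart = 20, iter.max = 50)

# Rows = variables, columns = clusters.
profiles <- t(km$centers)

# One line per cluster across the 22 variables.
matplot(profiles, type = "b", lty = 1, pch = 1:4, col = 1:4,
        xaxt = "n", xlab = "", ylab = "Cluster mean (standardised)")
axis(1, at = seq_len(nrow(profiles)), labels = rownames(profiles), las = 2)
abline(h = 0, lty = 3)   # overall mean of each standardised variable
legend("topright", legend = paste("Cluster", 1:4),
       lty = 1, pch = 1:4, col = 1:4)
```

Variables where a cluster's line sits far from the dotted zero line are the ones that set that cluster apart from the rest.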
Any advice and direction would be gratefully received.