[R] Help with 2-D plot of k-mean clustering analysis

Peter Langfelder peter.langfelder at gmail.com
Wed May 18 17:25:04 CEST 2011


On Wed, May 18, 2011 at 7:41 AM, Meng Wu <mengwu1002 at gmail.com> wrote:
> Hi, all
>
>  I would like to use R to perform k-means clustering on my data, which
> include 33 samples measured on ~1000 variables. I have already used
> the kmeans() function for this analysis, and found 4 clusters in my
> data. However, it is really difficult to plot these clusters in 2-D
> because of the "huge" number of variables. One possible way is to project the
> multidimensional space onto a 2-D plane, but I could not find any good way
> to do that. Any suggestions or comments would be really helpful!

You could use multidimensional scaling, function cmdscale(), to
produce a 2-dimensional representation of your data, then plot it
using colors that correspond to the clusters.

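For example, suppose your data is stored in a matrix X (1000x33, with
variables in rows and samples in columns). I assume you clustered the
samples, not the variables, so you have a vector label of length 33
with values between 1 and 4. Since k-means uses Euclidean distance,
you would re-create the Euclidean distance matrix between the samples
(dist() computes distances between rows, hence the transpose):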

dst = dist(t(X))

then feed it into cmdscale()

mds = cmdscale(dst)

then plot it:

plot(mds, col = label)
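
Put together, the steps above might look like the following sketch. The simulated matrix, the artificial shift that creates four groups, and the nstart value are assumptions made only so that the example runs on its own; substitute your own X and cluster labels.

```r
set.seed(1)                          # make the simulated data reproducible

n_var  <- 1000                       # number of variables (rows of X)
n_samp <- 33                         # number of samples (columns of X)
grp    <- rep(1:4, length.out = n_samp)   # hypothetical "true" groups

# Simulated stand-in for X: variables in rows, samples in columns
X <- matrix(rnorm(n_var * n_samp), nrow = n_var, ncol = n_samp)
# Shift the first 50 variables by a group-specific amount so groups separate
X[1:50, ] <- X[1:50, ] + rep(3 * grp, each = 50)

# Cluster the samples: kmeans() clusters rows, so transpose X
km    <- kmeans(t(X), centers = 4, nstart = 25)
label <- km$cluster                  # length 33, values between 1 and 4

# Euclidean distances between samples, then classical MDS into 2 dimensions
dst <- dist(t(X))
mds <- cmdscale(dst, k = 2)          # 33 x 2 matrix of coordinates

# One point per sample, colored by cluster
plot(mds, col = label, pch = 19,
     xlab = "MDS coordinate 1", ylab = "MDS coordinate 2")
legend("topright", legend = paste("cluster", 1:4), col = 1:4, pch = 19)
```

Note that the 2-D picture only approximates the distances in the full 1000-dimensional space, so clusters that are well separated there can still appear to overlap in the plot.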

HTH,

Peter


