[R] Question about PAM clustering method
Isogai Takashi
t_isog at hotmail.com
Fri Apr 18 10:57:15 CEST 2003
Hello everyone. I have just started learning R for cluster analysis in my
research project. I tried both the k-means method and the PAM method, and
both ran properly on my data. I have some questions about the graphical
output of PAM.
Suppose I run the commands shown below:
library(cluster)  # provides pam()
pm <- pam(D, 6)   # partition the data D into 6 clusters
plot(pm)
I got two charts after being prompted. In the first chart, 6 oval clusters
are drawn together with the data markers. I see four 'pink' lines that
connect the ovals. In my case, the ovals lie very close together, and some
of them overlap. Each line starts at the edge of one oval and ends at the
edge of another. Does anyone know what these lines mean? I imagine that a
line shows a close linkage between the corresponding clusters, but I cannot
find any comment on these lines in the help documents.
My second question is about the meaning of the comment "these two
components explain x% of the point variability" at the bottom of the graph.
In my case, the data has 6 (groups) x 20 (properties) dimension. I think
that R extracts the first and second factors and maps the data onto them in
the graph, so the number is the total contribution of those two factors.
Am I correct? If so, how can I choose factors other than the first and
second?
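To make this question concrete, here is a minimal sketch of what I imagine
is happening. It rests on assumptions of mine: that the plot is based on
principal components of the standardized variables, and that prcomp() is a
fair stand-in for whatever the plot does internally. The matrix D below is
simulated, not my real data.

```r
library(cluster)  # provides pam()

set.seed(1)
# Hypothetical stand-in data: 60 observations x 20 properties
D <- matrix(rnorm(60 * 20), nrow = 60, ncol = 20)
D[1:30, 1:5] <- D[1:30, 1:5] + 3  # shift half the rows to create some structure

pm <- pam(D, 6)

# Principal components of the standardized variables (my assumption)
pc <- prcomp(D, scale. = TRUE)
var_share <- pc$sdev^2 / sum(pc$sdev^2)

# Share of the variability carried by the first two components
first_two <- sum(var_share[1:2])
print(round(100 * first_two, 1))

# Plot components 3 and 4 instead, colored by PAM cluster
plot(pc$x[, 3], pc$x[, 4], col = pm$clustering, pch = 19,
     xlab = "Component 3", ylab = "Component 4")
```

If my understanding is right, first_two should correspond to the x% shown
at the bottom of the graph, and the last plot would be a workaround for
looking at other components, though without the ovals.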
Lastly, I read a document that says, about the average silhouette width,
"even that highest width is below (say) 0.25, one may conclude that no
substantial structure has been found". Is this true? In my case, the
value is far below 0.25, possibly because some clusters overlap on the
graph. I can accept overlapping clusters from the viewpoint of my
research, but I wonder whether the PAM method is still useful for such
clusters.
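For reference, this is a sketch of how the average silhouette width can be
read from the pam object and compared across several numbers of clusters.
The data are simulated stand-ins, and the range of k (2 to 8) is an
arbitrary choice of mine for illustration.

```r
library(cluster)  # provides pam()

set.seed(1)
# Hypothetical stand-in data: 60 observations x 20 properties
D <- matrix(rnorm(60 * 20), nrow = 60, ncol = 20)

# Average silhouette width for each candidate number of clusters
ks <- 2:8
avg_width <- sapply(ks, function(k) pam(D, k)$silinfo$avg.width)
names(avg_width) <- ks
print(round(avg_width, 3))

# The k with the highest average width; the quoted rule of thumb
# compares this highest value with 0.25
best_k <- ks[which.max(avg_width)]
print(best_k)
```

With pure noise like this I would expect every width to be small, which is
the situation the quoted sentence seems to describe.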
Thank you very much in advance for your help.
T. Isog
Tokyo, Japan