[R] Question about PAM clustering method

Fri Apr 18 10:57:15 CEST 2003

Hello everyone.  I have just started learning R for cluster analysis in my 
research project.  I tried the k-means and PAM methods, both of which ran 
properly on my data.  I have some questions about the PAM graphical 
output.  

Suppose I run the commands shown below:
library(cluster)
pm <- pam(D, 6)
plot(pm)

I got two charts after being prompted.  In the first chart, 6 oval clusters 
are drawn together with the data markers.  I see four 'pink' lines that 
connect the ovals.  In this case, the ovals are located very close 
together, and some of them overlap.  Each line starts at the edge of one 
oval and ends at the edge of another.  Does anyone know the meaning of 
these lines?  I imagine that a line indicates close linkage between the 
corresponding clusters, but I could not find any comment about these lines 
in the help documents.  
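
A minimal sketch for experimenting with these lines, assuming they are the 
segments controlled by the `lines` argument of clusplot() in the cluster 
package (0 = none, 1 = between ellipse centres, 2 = between ellipse 
boundaries).  D_demo and pm_demo are hypothetical stand-ins for my own data 
and fit:

```r
library(cluster)  # recommended package that ships with R

set.seed(42)
# Hypothetical stand-in for D: three overlapping groups of 2-D points
D_demo <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
                matrix(rnorm(60, mean = 2), ncol = 2),
                matrix(rnorm(60, mean = 4), ncol = 2))
pm_demo <- pam(D_demo, 3)

# Draw only the first chart (the cluster plot), without the prompt
plot(pm_demo, which.plots = 1)

# Redraw the cluster plot with each setting of `lines` to compare
clusplot(pm_demo, lines = 0)  # no connecting segments
clusplot(pm_demo, lines = 1)  # segments between ellipse centres
clusplot(pm_demo, lines = 2)  # segments between ellipse boundaries
```

Comparing the three charts side by side should show whether the segments 
in question appear and disappear with this argument.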

My second question is about the meaning of the comment "these two 
components explain x% of the point variability" at the bottom of the graph.  
In my case, the data have dimension 6 (groups) x 20 (properties).  I think 
that R extracts the first and second factors and maps the data onto them.  
Therefore, the number is the total contribution of those two factors.  Am I 
correct?  If so, how can I choose factors other than the first and second?  
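
To make this question concrete, here is a rough sketch of how such a 
percentage could be reproduced by hand.  It assumes the plot is based on 
principal components of the data (computed here with princomp() on the 
correlation matrix, which is my guess and not something I have confirmed), 
and D_demo and pm_demo are hypothetical stand-ins:

```r
library(cluster)

set.seed(1)
# Hypothetical stand-in for D: 60 observations of 5 numeric properties
D_demo <- matrix(rnorm(60 * 5), nrow = 60, ncol = 5)

# Principal components on the correlation matrix (an assumption about
# what the cluster plot does internally)
pc <- princomp(D_demo, cor = TRUE)

# Share of the total variance carried by each component
var_share <- pc$sdev^2 / sum(pc$sdev^2)
first_two <- sum(var_share[1:2])
cat(sprintf("Components 1 and 2 explain %.2f%% of the variability\n",
            100 * first_two))

# Other components can be plotted by hand, e.g. components 1 and 3,
# with points coloured by their PAM cluster
pm_demo <- pam(D_demo, 3)
plot(pc$scores[, c(1, 3)], col = pm_demo$clustering,
     xlab = "Component 1", ylab = "Component 3")
```

If the assumption holds, the value of first_two should match the x% 
printed under the chart for the same data.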

Lastly, I read a document that says, about the average silhouette width, 
"even that highest width is below (say) 0.25, one may conclude that no 
substantial structure has been found".  Is this true?  In my case, the 
value is far below 0.25, possibly because some clusters overlap on the 
graph.  Overlapping clusters are acceptable from the viewpoint of my 
research, but I wonder whether the PAM method is still useful for such 
clusters. 
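
For reference, a small sketch of how the average silhouette width can be 
read from a pam() fit and compared across different numbers of clusters.  
The data here are made up (two well-separated groups), so the values will 
not resemble mine; D_demo, pm_demo and avg_by_k are hypothetical names:

```r
library(cluster)

set.seed(1)
# Hypothetical data: two well-separated groups of 2-D points
D_demo <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
                matrix(rnorm(40, mean = 6), ncol = 2))

pm_demo <- pam(D_demo, 2)

# Overall average silhouette width, and the average within each cluster
print(pm_demo$silinfo$avg.width)
print(pm_demo$silinfo$clus.avg.widths)

# Average silhouette width for k = 2..6, to see which k scores highest
avg_by_k <- sapply(2:6, function(k) pam(D_demo, k)$silinfo$avg.width)
names(avg_by_k) <- 2:6
print(round(avg_by_k, 3))
```

The quoted 0.25 threshold refers to the highest of these averages over 
the values of k that were tried.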

Thank you very much in advance for your help.  

T. Isog
Tokyo, Japan
