[R] Clustering algorithms don't find obvious clusters

Cuvelier Etienne ecuscimail at gmail.com
Fri Jun 11 13:51:56 CEST 2010



On 11/06/2010 12:45, Henrik Aldberg wrote:
> I have a directed graph which is represented as a matrix of the form
>
> 0 4 0 1
> 6 0 0 0
> 0 1 0 5
> 0 0 4 0
>
> Each row corresponds to an author (A, B, C, D) and the values say how many
> times this author has cited the other authors. Hence the first row says
> that author A has cited author B four times and author D once. Thus the
> matrix represents two groups of authors, (A,B) and (C,D), who cite each
> other. But there is also a weak link between the groups. In reality this
> matrix is much bigger and very sparse, but it still consists of distinct
> groups of authors.
>
> My problem is that when I cluster the matrix using pam, clara or agnes, the
> algorithms do not find the obvious clusters. I have tried to turn it into
> a dissimilarity matrix before clustering but that did not help either.
>
> The layout of the clustering is not that important to me; my primary
> interest is to get the right nodes into the right clusters.
>
>
>
>    
Hello Henrik,
You can use graph clustering (community detection) from the igraph package.
Example:

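library(igraph)

# Citation counts: row i cites column j (authors A, B, C, D)
simM <- rbind(c(0, 4, 0, 1),
              c(6, 0, 0, 0),
              c(0, 1, 0, 5),
              c(0, 0, 4, 0))

# Build a weighted, directed graph from the adjacency matrix
G <- graph.adjacency(simM, weighted=TRUE, mode="directed")
plot(G, layout=layout.kamada.kawai)

### walktrap.community
# Random-walk based community detection; keep the modularity of each merge step
wt <- walktrap.community(G, modularity=TRUE)
# Cut the merge tree at the step with the highest modularity
wmemb <- community.to.membership(G, wt$merges,
                                 steps=which.max(wt$modularity)-1)

# Colour vertices by community; the +1 maps membership ids onto R's 1-based indexing
V(G)$color <- rainbow(3)[wmemb$membership+1]
plot(G)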
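
For readers on a more recent igraph release, where several of the function names used above have been renamed, the same idea can be sketched roughly as follows. This is only a sketch under assumptions: it assumes an igraph version that provides graph_from_adjacency_matrix(), cluster_walktrap() and membership(), and the A-D vertex labels are added here purely for illustration.

```r
library(igraph)

# Citation counts: row i cites column j (labels A-D are illustrative)
simM <- rbind(c(0, 4, 0, 1),
              c(6, 0, 0, 0),
              c(0, 1, 0, 5),
              c(0, 0, 4, 0))
dimnames(simM) <- list(LETTERS[1:4], LETTERS[1:4])

# Weighted, directed graph; the matrix entries become the edge weights
G <- graph_from_adjacency_matrix(simM, mode = "directed", weighted = TRUE)

# Walktrap community detection; the 'weight' edge attribute is used by default
wt <- cluster_walktrap(G)

# Community id per vertex, taken at the best-modularity cut of the merge tree
ids <- as.integer(membership(wt))
names(ids) <- V(G)$name
print(ids)

# Colour vertices by community (ids are 1-based in this API)
V(G)$color <- rainbow(max(ids))[ids]
# plot(G, layout = layout_with_kk)   # uncomment to draw the graph
```

Vertices that share an id in the printed vector were placed in the same community, so on a large sparse matrix the vector can be used directly as a cluster assignment.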

I hope this helps.

Etienne

> Sincerely
>
>
> Henrik