[R] clustering fuzzy

pete pieroleone at hotmail.it
Wed Feb 2 19:14:10 CET 2011


After sorting each row of the membership degree table in decreasing order, I
need the difference between the first and second columns, that is, between
the largest and second-largest membership degree of object i. I need this for
K=2, K=3, ..., up to K.max=6.
This difference is multiplied by the crisp silhouette index vector (si),
which also depends on K=2,...,K.max=6, and the result is divided by the sum
of these differences.
I need a final vector containing the index for each clustering
(K=2,...,K.max=6).
I think this can be done along the lines of classe.memb, but I can't solve
the problem, because my transformations of the membership degree matrix
(ris$membership) and of the list object (ris$silinfo) do not let me use
what classe.memb returns.

\sum_i (u_{i1} - u_{i2}) s_i / \sum_i (u_{i1} - u_{i2})
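
The ratio above could be computed along these lines. This is only a minimal sketch: weighted_sil is a hypothetical helper name, U is assumed to be an n x K membership matrix (such as ris$membership), and s is assumed to hold the crisp silhouette widths in the same object order as the rows of U.

```r
# Hypothetical helper (not part of any package): weighted average of the
# crisp silhouette widths, with weights u_i1 - u_i2.
weighted_sil <- function(U, s) {
  U <- as.matrix(U)
  stopifnot(nrow(U) == length(s), ncol(U) >= 2)
  # sort every row in decreasing order; apply() returns K x n, so transpose
  U.sort <- t(apply(U, 1, sort, decreasing = TRUE))
  # difference between largest and second-largest membership of each object
  w <- U.sort[, 1] - U.sort[, 2]
  sum(w * s) / sum(w)
}

# Toy data, invented for illustration only
U <- rbind(c(0.7, 0.3),
           c(0.2, 0.8))
s <- c(0.5, 0.1)
res <- weighted_sil(U, s)
print(res)
```

Objects whose two largest memberships are close get a small weight, so they contribute little to the index.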


> head(t(A.sort))     # membership degrees table, each row sorted from max to min
  [,1] [,2] [,3] [,4] 
1 0.66 0.30 0.04 0.01 
2 0.89 0.09 0.02 0.00 
3 0.92 0.06 0.01 0.01 
4 0.71 0.21 0.07 0.01 
5 0.85 0.10 0.04 0.01 
6 0.91 0.04 0.02 0.02 
> H.Asort=head(t(A.sort)) 
> H.Asort[,1]-H.Asort[,2] 
   1    2    3    4    5    6 
0.36 0.80 0.86 0.50 0.75 0.87 

> H.Asort=t(H.Asort[,1]-H.Asort[,2]) 
This is the vector of differences, which has to be multiplied by the transformed ris$silinfo table.
> ris$silinfo 
$widths 
   cluster neighbor   sil_width 
72       1        3  0.43820207 
54       1        3  0.43427773 
29       1        6  0.41729079 
62       1        6  0.40550562 
64       1        6  0.32686757 
32       1        3  0.30544722 
45       1        3  0.30428723 
79       1        3  0.30192624 
12       1        3  0.30034472 
60       1        6  0.29642495 
41       1        3  0.29282778 
1        1        3  0.28000788 
85       1        3  0.24709237 
74       1        3  0.239 




> P=ris$silinfo 
> P=P[1] 
>  P=as.data.frame(P) 
>  V4=rownames(P) 
>  mode(V4)="numeric" 
>  P[,4]=V4 
>  P[order(P$V4),] 

   widths.cluster widths.neighbor widths.sil_width V4 
1               1               3       0.28000788  1 
2               2               4       0.07614849  2 
3               2               3      -0.11676440  3 
4               2               4       0.15436648  4 
5               2               3       0.14693927  5 
6               3               1       0.57083836  6 
7               4               5       0.36391826  7 
8               5               4       0.63491118  8 
9               4               2       0.54458733  9 
10              5               4       0.51059626 10 
11              2               5       0.03908952 11 
12              1               3       0.30034472 12 
13              1               3      -0.04928562 13 
14              4               3       0.20337180 14 
15              3               4       0.46164324 15 
18              5               4       0.52066782 18 
20              4               3       0.45517287 20 
21              3               4       0.39405507 21 
22              4               5       0.05574547 22 
23              6               1      -0.06750403 23 
> P= P[order(P$V4),] 

P=P[,3] 
This is the transformed ris$silinfo vector, P.
I can't use this vector object in classe.memb.
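
The reordering step can probably be written more directly. A minimal sketch, assuming the row names of ris$silinfo$widths are the object numbers 1..n (as in the printout above); sil_in_object_order is a hypothetical helper name and the small matrix is a mock standing in for the real output.

```r
# Mock of ris$silinfo$widths: rows sorted by cluster, row names = object numbers
# (values invented for illustration only)
widths <- matrix(c(1, 1, 2,
                   2, 2, 1,
                   0.4, 0.3, 0.5),
                 ncol = 3,
                 dimnames = list(c("3", "1", "2"),
                                 c("cluster", "neighbor", "sil_width")))

# Hypothetical helper: silhouette widths put back in original object order
sil_in_object_order <- function(widths) {
  idx <- order(as.numeric(rownames(widths)))
  widths[idx, "sil_width"]
}

s <- sil_in_object_order(widths)
print(s)
```

This avoids the detour through a data frame and an extra V4 column, and the result is a plain numeric vector that lines up with the rows of ris$membership.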
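# assumes library(cluster) is loaded, m (the membership exponent) is set,
# and classe.memb() (defined below) has already been sourced
J=ncol(frj)
K=2
K.max=6
while (K<=K.max)
 {
  ris=fanny(frj,K,memb.exp=m,metric="SqEuclidean",stand=TRUE,maxit=1000,tol=1e-6)
  # fuzzy centroids: weighted means of the variables, weights = membership^m
  ris$centroid=matrix(0,nrow=K,ncol=J)
  for (k in 1:K)
   {
    ris$centroid[k,]=(t(ris$membership[,k]^m)%*%as.matrix(frj))/sum(ris$membership[,k]^m)
   }
  rownames(ris$centroid)=1:K
  colnames(ris$centroid)=colnames(frj)
  print(K)
  print(round(ris$centroid,2))
  print(classe.memb(ris$membership)$table.U)
  print(ris$silinfo$avg.width)
  K=K+1
 }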
This should be the scheme: for each K the centroids are computed and the memberships are summarised with classe.memb.
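
To get the final vector, one index per K, the same scheme could collect a value from each fit. A minimal sketch under stated assumptions: weighted_sil_fit is a hypothetical helper, each fit is assumed to look like fanny() output (with $membership and $silinfo$widths), and the row names of $silinfo$widths are assumed to be the object numbers 1..n. The two mock fits below are invented stand-ins so the sketch runs without the data.

```r
# Hypothetical helper: index sum_i (u_i1 - u_i2) s_i / sum_i (u_i1 - u_i2)
# for one fit that is assumed to have the structure of a fanny() result.
weighted_sil_fit <- function(ris) {
  U <- as.matrix(ris$membership)
  W <- ris$silinfo$widths
  # silinfo$widths is sorted by cluster; restore object order via row names
  s <- W[order(as.numeric(rownames(W))), "sil_width"]
  U.sort <- t(apply(U, 1, sort, decreasing = TRUE))
  w <- U.sort[, 1] - U.sort[, 2]
  sum(w * s) / sum(w)
}

# Mock fits (invented numbers) for K = 2 and K = 3
wnames <- c("cluster", "neighbor", "sil_width")
mock2 <- list(
  membership = rbind("1" = c(0.7, 0.3), "2" = c(0.2, 0.8)),
  silinfo = list(widths = matrix(c(2, 1, 1, 2, 0.1, 0.5), ncol = 3,
                                 dimnames = list(c("2", "1"), wnames))))
mock3 <- list(
  membership = rbind("1" = c(0.6, 0.3, 0.1), "2" = c(0.1, 0.1, 0.8)),
  silinfo = list(widths = matrix(c(1, 3, 2, 1, 0.2, 0.4), ncol = 3,
                                 dimnames = list(c("1", "2"), wnames))))

fits <- list("2" = mock2, "3" = mock3)
FS <- sapply(fits, weighted_sil_fit)   # named vector, one index per K
print(FS)
```

With the real data, fits would instead be built from the fanny() calls in the loop, for example by storing ris in a list at each K from 2 to K.max.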

classe.memb=function(U) 
{ 
 info.U=cbind(max.col(U),apply(U,1,max)) 
 i=1 
 while (i <= nrow(U)) 
  { 
   if (apply(U,1,max)[i]<0.5) info.U[i,1]=0 
   i=i+1 
  } 
 K=ncol(U) 
 table.U=matrix(0,nrow=K,ncol=4) 
 cl=1 
 while (cl <= K) 
  { 
   table.U[cl,1] = length(which(info.U[info.U[,1]==cl,2]>=.90)) 
   table.U[cl,2] = length(which(info.U[info.U[,1]==cl,2]>=.70)) - table.U[cl,1]
   table.U[cl,3] = length(which(info.U[info.U[,1]==cl,2]>=.50)) - table.U[cl,1] - table.U[cl,2]
   table.U[cl,4] = sum(table.U[cl,]) 
   cl = cl+1 
  } 
 rownames(table.U) = c(1:K) 
 colnames(table.U) = c("Alto", "Medio", "Basso", "Totale") 
 out=list() 
 out$info.U=round(info.U,2) 
 out$table.U=table.U 
 return(out) 
}
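
For comparison, the same tabulation can be sketched without the explicit while loops. This is an assumption-laden variant, not a replacement: classe.memb.vec is a hypothetical name, and it uses ties.method = "first" in max.col() so that results are reproducible (the default breaks ties at random).

```r
# Hypothetical vectorised variant of classe.memb()
classe.memb.vec <- function(U) {
  U <- as.matrix(U)
  K <- ncol(U)
  top <- apply(U, 1, max)                    # largest membership per object
  cl <- max.col(U, ties.method = "first")    # cluster with largest membership
  cl[top < 0.5] <- 0                         # below 0.5: not assigned
  # Basso: [0.5, 0.7), Medio: [0.7, 0.9), Alto: [0.9, Inf)
  level <- cut(top, breaks = c(0.5, 0.7, 0.9, Inf), right = FALSE,
               labels = c("Basso", "Medio", "Alto"))
  tab <- table(factor(cl, levels = seq_len(K)),
               factor(level, levels = c("Alto", "Medio", "Basso")))
  table.U <- matrix(as.vector(tab), nrow = K,
                    dimnames = list(seq_len(K), c("Alto", "Medio", "Basso")))
  table.U <- cbind(table.U, Totale = rowSums(table.U))
  list(info.U = round(cbind(cl, top), 2), table.U = table.U)
}

# Toy membership matrix, invented for illustration only
U <- rbind(c(0.95, 0.03, 0.02),
           c(0.75, 0.20, 0.05),
           c(0.30, 0.60, 0.10),
           c(0.05, 0.03, 0.92),
           c(0.40, 0.35, 0.25))
out <- classe.memb.vec(U)
print(out$table.U)
```

Objects with a largest membership below 0.5 drop out of the table because table() ignores the resulting NA entries.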