[R] clustering fuzzy
pete
pieroleone at hotmail.it
Wed Feb 2 19:14:10 CET 2011
After sorting the table of membership degrees, I need the difference
between the first and second columns, that is, between the largest and
second-largest membership degrees of object i. This must be done for
K=2, K=3, ..., up to K.max=6.
This difference is multiplied by the crisp silhouette index vector (si), which
also depends on K=2,...,K.max=6; the sum of these products is then divided by
the sum of the differences.
I need a final vector containing the index for each clustering
(K=2,...,K.max=6).
There is a function, I think it is classe.memb, but I cannot solve the problem:
after transforming the membership degree matrix (ris$membership) and the
list object (ris$silinfo), I am no longer able to use the properties of
classe.memb.
\frac{\sum_i (u_{i1} - u_{i2}) \, s_i}{\sum_i (u_{i1} - u_{i2})}
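A minimal sketch of this weighted average in R may make the formula concrete. The function name fuzzy.sil and the toy values of U and s are hypothetical, chosen only for illustration; it assumes U has one row per object and that s is in the same row order as U.

```r
# Weighted mean of the crisp silhouette widths s_i, with weights equal to
# the gap between the two largest membership degrees of each object.
fuzzy.sil <- function(U, s) {
  U <- as.matrix(U)
  stopifnot(nrow(U) == length(s), ncol(U) >= 2)
  # sort each row in decreasing order; apply() returns the result transposed
  U.sort <- t(apply(U, 1, sort, decreasing = TRUE))
  w <- U.sort[, 1] - U.sort[, 2]   # u_i1 - u_i2
  sum(w * s) / sum(w)
}

# Toy data (hypothetical values, not taken from a real clustering)
U <- rbind(c(0.30, 0.66, 0.04),
           c(0.89, 0.09, 0.02),
           c(0.06, 0.02, 0.92))
s <- c(0.2, 0.5, 0.8)
fs <- fuzzy.sil(U, s)
```

Sorting inside the function means the membership matrix does not have to be sorted beforehand.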
> head(t(A.sort))  # membership degrees table, each row sorted from max to min
[,1] [,2] [,3] [,4]
1 0.66 0.30 0.04 0.01
2 0.89 0.09 0.02 0.00
3 0.92 0.06 0.01 0.01
4 0.71 0.21 0.07 0.01
5 0.85 0.10 0.04 0.01
6 0.91 0.04 0.02 0.02
> H.Asort=head(t(A.sort))
> H.Asort[,1]-H.Asort[,2]
1 2 3 4 5 6
0.36 0.80 0.86 0.50 0.75 0.87
> H.Asort=t(H.Asort[,1]-H.Asort[,2])
This is the vector of differences, which must be multiplied by the transformed ris$silinfo table.
> ris$silinfo
$widths
cluster neighbor sil_width
72 1 3 0.43820207
54 1 3 0.43427773
29 1 6 0.41729079
62 1 6 0.40550562
64 1 6 0.32686757
32 1 3 0.30544722
45 1 3 0.30428723
79 1 3 0.30192624
12 1 3 0.30034472
60 1 6 0.29642495
41 1 3 0.29282778
1 1 3 0.28000788
85 1 3 0.24709237
74 1 3 0.239
> P=ris$silinfo
> P=P[1]
> P=as.data.frame(P)
> V4=rownames(P)
> mode(V4)="numeric"
> P[,4]=V4
> P[order(P$V4),]
widths.cluster widths.neighbor widths.sil_width V4
1 1 3 0.28000788 1
2 2 4 0.07614849 2
3 2 3 -0.11676440 3
4 2 4 0.15436648 4
5 2 3 0.14693927 5
6 3 1 0.57083836 6
7 4 5 0.36391826 7
8 5 4 0.63491118 8
9 4 2 0.54458733 9
10 5 4 0.51059626 10
11 2 5 0.03908952 11
12 1 3 0.30034472 12
13 1 3 -0.04928562 13
14 4 3 0.20337180 14
15 3 4 0.46164324 15
18 5 4 0.52066782 18
20 4 3 0.45517287 20
21 3 4 0.39405507 21
22 4 5 0.05574547 22
23 6 1 -0.06750403 23
> P= P[order(P$V4),]
> P=P[,3]
This is the transformed ris$silinfo vector, P.
I cannot use this vector object in classe.memb.
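The same reordering can probably be done without going through a data frame. The sketch below uses a small mock of ris$silinfo$widths (hypothetical values) and assumes, as in the output above, that the row names hold the original object numbers.

```r
# Mock of ris$silinfo$widths (hypothetical values): rows are grouped by
# cluster, and the row names are the original object numbers.
widths <- matrix(c(1, 3,  0.44,
                   1, 3,  0.28,
                   2, 4,  0.08,
                   2, 3, -0.12),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("72", "1", "2", "3"),
                                 c("cluster", "neighbor", "sil_width")))

# Reorder by numeric row name, then keep only the silhouette column
idx <- order(as.numeric(rownames(widths)))
s <- widths[idx, "sil_width"]
```

The result is a plain named numeric vector in the same object order as the rows of ris$membership, so it can be multiplied element by element with the vector of differences.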
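library(cluster)  # provides fanny()
# frj (data), m (membership exponent) and J (number of variables,
# presumably ncol(frj)) must already be defined
K=2
K.max=6
while (K<=K.max)
{
  ris=fanny(frj,K,memb.exp=m,metric="SqEuclidean",stand=TRUE,maxit=1000,tol=1e-6)
  # centroids: means of the variables weighted by membership^m
  ris$centroid=matrix(0,nrow=K,ncol=J)
  for (k in 1:K)
  {
    ris$centroid[k,]=(t(ris$membership[,k]^m)%*%as.matrix(frj))/sum(ris$membership[,k]^m)
  }
  rownames(ris$centroid)=1:K
  colnames(ris$centroid)=colnames(frj)
  print(K)
  print(round(ris$centroid,2))
  print(classe.memb(ris$membership)$table.U)  # classe.memb is defined below
  print(ris$silinfo$avg.width)
  K=K+1
}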
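One possible way to collect the index for every K into a single final vector is sketched below. It is only an illustration under stated assumptions: the built-in iris data stands in for frj, memb.exp=1.5 is an arbitrary choice, and the name fs is hypothetical. It relies on the row names of ris$silinfo$widths being the object numbers.

```r
library(cluster)  # fanny(); a recommended package shipped with standard R

X <- scale(iris[, 1:4])          # stand-in data; the post uses frj
rownames(X) <- seq_len(nrow(X))  # make the object numbers explicit
K.range <- 2:6
fs <- setNames(numeric(length(K.range)), paste0("K=", K.range))

for (K in K.range) {
  ris <- fanny(X, K, memb.exp = 1.5)  # exponent chosen for illustration
  # crisp silhouette widths, put back in the original object order
  w.tab <- ris$silinfo$widths
  s <- w.tab[order(as.numeric(rownames(w.tab))), "sil_width"]
  # gap between the largest and second-largest membership of each object
  U.sort <- t(apply(ris$membership, 1, sort, decreasing = TRUE))
  w <- U.sort[, 1] - U.sort[, 2]
  fs[paste0("K=", K)] <- sum(w * s) / sum(w)
}
print(round(fs, 3))
```

fanny() may warn when the memberships become nearly uniform for larger K; in that case the differences are close to zero and the index becomes unstable, so the values for those K should be read with care.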
This should be the overall scheme: for each K the centroids are determined and the memberships are summarized with classe.memb.
classe.memb=function(U)
{
  # column 1: cluster with the largest membership; column 2: that membership
  info.U=cbind(max.col(U),apply(U,1,max))
  # objects whose largest membership is below 0.5 are left unassigned (cluster 0)
  info.U[info.U[,2]<0.5,1]=0
  K=ncol(U)
  table.U=matrix(0,nrow=K,ncol=4)
  for (cl in 1:K)
  {
    memb=info.U[info.U[,1]==cl,2]
    table.U[cl,1] = sum(memb>=.90)                                  # high: >= 0.90
    table.U[cl,2] = sum(memb>=.70) - table.U[cl,1]                  # medium: [0.70, 0.90)
    table.U[cl,3] = sum(memb>=.50) - table.U[cl,1] - table.U[cl,2]  # low: [0.50, 0.70)
    table.U[cl,4] = sum(table.U[cl,1:3])                            # total assigned
  }
  rownames(table.U) = 1:K
  # Alto = high, Medio = medium, Basso = low, Totale = total
  colnames(table.U) = c("Alto", "Medio", "Basso", "Totale")
  out=list()
  out$info.U=round(info.U,2)
  out$table.U=table.U
  return(out)
}
--
View this message in context: http://r.789695.n4.nabble.com/clustering-fuzzy-tp3229853p3255223.html