[R] Examining how cases are similar by cluster, in cluster analysis
Bob Green
bgreen at dyson.brisnet.org.au
Sun Nov 18 12:00:16 CET 2012
Hello,
I used the following code to perform a cluster analysis on a
dataframe consisting of 12 variables (coded as 1,0) and 63 cases.
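# Read the 0/1 data; the first column holds the case ids
FS1 <- read.csv("D://Arsontest2.csv", header=TRUE, row.names=1)
str(FS1)
# Binary (Jaccard-type) distances, then average-linkage clustering
dmat <- dist(FS1, method="binary")
cl.test <- hclust(dmat, method="average")
plot(cl.test, hang = -1)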
Each case has an id and the dendrogram identifies the respective cases
which constitute each cluster. What I am seeking advice on is how to
examine the variables on which the cases are similar, within each cluster.
sort(hcli8 <- cutree(cl.test, k=8)) shows that cluster 2 comprises
the following cases:
1641 2295 2594 2654 2799 3213 3510 3513 2958 3294
   2    2    2    2    2    2    2    2    2    2
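The raw 0/1 profiles of the cases in a single cluster can be listed by subsetting the data frame with the cutree() membership vector. A minimal sketch on made-up data: the data frame toy, its case ids and its variable names are hypothetical stand-ins for FS1, and k = 2 is used only because the toy data are so small.

```r
# Hypothetical toy data standing in for FS1: 6 cases, 3 binary variables
toy <- data.frame(
  depressed = c(1, 1, 0, 0, 1, 0),
  unclear   = c(1, 1, 0, 0, 1, 0),
  alcohol   = c(0, 1, 1, 1, 0, 1),
  row.names = c("101", "102", "103", "104", "105", "106")
)

cl.toy <- hclust(dist(toy, method = "binary"), method = "average")
grp    <- cutree(cl.toy, k = 2)   # named vector: case id -> cluster number

# Rows (cases) that fall in the same cluster as case "101"
in.cluster <- toy[grp == grp["101"], ]
print(in.cluster)
```

With the real data the equivalent call would presumably be FS1[hcli8 == 2, ], which prints all 12 variables for the cases in cluster 2 side by side.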
The following code gives the variable means by cluster. For cluster 2,
it appears the cases tend to have no clear motive and to be depressed:
round(sapply(x, function(i) colMeans(FS1[i,])),2)
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
depressed 0.00 0.33 0.00  0.0    0  0.6 0.00 0.08
unclear   0.33 1.00 1.00  1.0    0  0.0 0.07 0.12
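The object x in the sapply() call is not defined anywhere in this post. A minimal sketch of how such a table can be produced, assuming x is a list holding the row indices of each cluster (the toy data and the use of split() are assumptions for illustration, not the original code):

```r
# Hypothetical toy data standing in for FS1
toy <- data.frame(
  depressed = c(1, 1, 0, 0, 1, 0),
  unclear   = c(1, 1, 0, 0, 1, 0),
  alcohol   = c(0, 1, 1, 1, 0, 1),
  row.names = c("101", "102", "103", "104", "105", "106")
)
grp <- cutree(hclust(dist(toy, method = "binary"), method = "average"), k = 2)

# Assumption: x is a list of row indices, one element per cluster
x <- split(seq_len(nrow(toy)), grp)

# Proportion of cases scoring 1 on each variable, one column per cluster
means <- round(sapply(x, function(i) colMeans(toy[i, , drop = FALSE])), 2)
print(means)
```

Because the variables are coded 0/1, each mean is simply the proportion of cases in that cluster scoring 1 on the variable.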
I can manually examine this variable by variable and look at how
the cases in cluster 2 are similar on each variable. I am
looking for a more efficient and quicker way to do this.
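To illustrate the kind of summary meant here, one possible approach is to list, for every cluster at once, the variables on which all of its cases agree (all 0 or all 1). This is only a rough sketch on made-up data; toy, grp and unanimous are hypothetical names, and whether such a summary is adequate for the real data is an open question.

```r
# Hypothetical toy data standing in for FS1
toy <- data.frame(
  depressed = c(1, 1, 0, 0, 1, 0),
  unclear   = c(1, 1, 0, 0, 1, 0),
  alcohol   = c(0, 1, 1, 1, 0, 1),
  row.names = c("101", "102", "103", "104", "105", "106")
)
grp <- cutree(hclust(dist(toy, method = "binary"), method = "average"), k = 2)

# For each cluster, keep the variables whose mean is exactly 0 or 1,
# i.e. the variables on which every case in the cluster has the same code
unanimous <- lapply(split(toy, grp), function(d) {
  m <- colMeans(d)
  names(m)[m == 0 | m == 1]
})
print(unanimous)
```

Relaxing the condition to, say, m <= 0.1 | m >= 0.9 would also pick up variables on which the cases nearly agree.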
Bob