Hi:
I have been trying to find an appropriate clustering algorithm for mixed-type data.
The approach I usually see suggested is:

1. Use daisy to compute the dissimilarity matrix.

2. Pass the dissimilarity matrix to PAM or hclust to get the clusters.

However, with this approach the dissimilarity matrix takes O(n^2) memory, so when the data set grows larger, say to 10,000 rows, I run out of memory.
Is there a better way to cluster mixed-type data?
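
For concreteness, here is a minimal sketch of the two steps above using the cluster package. The toy data frame, its column names, the Gower metric, and k = 3 are placeholders chosen for illustration, not my real data or settings. The last lines estimate how much memory the pairwise dissimilarities alone would need at 10,000 rows.

```r
library(cluster)

set.seed(1)

# Hypothetical toy data with mixed column types (numeric, nominal, ordinal)
df <- data.frame(
  age    = rnorm(200, mean = 40, sd = 10),
  income = rlnorm(200, meanlog = 10, sdlog = 0.5),
  gender = factor(sample(c("F", "M"), 200, replace = TRUE)),
  grade  = factor(sample(c("low", "mid", "high"), 200, replace = TRUE),
                  levels = c("low", "mid", "high"), ordered = TRUE)
)

# Step 1: pairwise dissimilarities; Gower's coefficient handles mixed types
d <- daisy(df, metric = "gower")

# Step 2: PAM on the precomputed dissimilarities (k = 3 is an arbitrary choice)
fit <- pam(d, k = 3, diss = TRUE)
table(fit$clustering)

# daisy stores n * (n - 1) / 2 doubles (8 bytes each), so for n = 10,000:
n <- 10000
size_mib <- n * (n - 1) / 2 * 8 / 2^20
size_mib  # about 381 MiB, before any working copies made during clustering
```

The dissimilarity object is stored as a lower triangle, so the O(n^2) growth is in that single vector; functions that expand it to a full matrix or copy it will need a multiple of that.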

Cheng Yi



