[R] difference between trees in R?
Mark Robinson
m.robinson at utoronto.ca
Tue Aug 21 16:09:27 CEST 2001
Hi.
I am wondering if anybody has studied and/or written code in R to
calculate the distance between 2 "trees". For example, suppose one does a
hierarchical agglomerative clustering and, say, a hierarchical divisive
clustering (both represented as trees) and wishes to compute a metric on
them. I am thinking of something like the symmetric difference as
mentioned in Margush and McMorris (1982).
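
To make the question concrete, here is a minimal sketch of one common reading of a symmetric difference between two trees: count the clusters (leaf sets under an internal node) that appear in exactly one of the two trees. This may not match the exact definition in the cited paper. The helper names tree_clusters and tree_symdiff are hypothetical, the USArrests data is only a stand-in, and two hclust linkages are used in place of an agglomerative/divisive pair.

```r
# Collect the leaf set under every internal node of an hclust tree.
# In hc$merge, negative entries are single observations and positive
# entries refer to the cluster formed at that earlier merge step.
tree_clusters <- function(hc) {
  n_merges <- nrow(hc$merge)
  members <- vector("list", n_merges)
  for (i in seq_len(n_merges)) {
    leaves <- integer(0)
    for (node in hc$merge[i, ]) {
      leaves <- c(leaves, if (node < 0) -node else members[[node]])
    }
    members[[i]] <- sort(leaves)
  }
  # Encode each cluster as a string so sets of clusters can be compared.
  vapply(members, paste, character(1), collapse = ",")
}

# Symmetric difference: number of clusters found in exactly one tree.
tree_symdiff <- function(hc1, hc2) {
  a <- tree_clusters(hc1)
  b <- tree_clusters(hc2)
  length(setdiff(a, b)) + length(setdiff(b, a))
}

d <- dist(scale(USArrests))            # illustrative data only
hc_complete <- hclust(d, method = "complete")
hc_single <- hclust(d, method = "single")

tree_symdiff(hc_complete, hc_complete)  # identical trees give 0
tree_symdiff(hc_complete, hc_single)
```

Both trees must be built on the same set of observations in the same order, since clusters are compared by observation index.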
My application is actually a bit different from the one above, so I'll
describe it. I want to combine numerous k-means
classifications into one. Because subsequent runs of the k-means
procedure are going to give different cluster memberships (because of
different starting points), I wanted to run it a bunch of times and
combine the results into a consensus. But to do that, I wanted to quantify how
different a consensus of, for example, 3 k-means runs is from a
consensus of 4 k-means runs (denoted here by d(3,4)).
Presumably, the sequence d(3,4), d(4,5), ..., d(p,p+1) would keep
decreasing, and at some point I would be satisfied that no further k-means
runs need to be added to the consensus.
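
One way to sketch the sequence d(p,p+1), without any trees, is to summarise each k-means run as a 0/1 co-membership matrix (do observations i and j share a cluster?), take the consensus of p runs as the average of the first p matrices, and measure d(p,p+1) as the mean absolute difference between successive consensus matrices. All of this is an assumption made for illustration: the co-membership representation, the choice of distance, the helper names comembership and consensus, and the iris data.

```r
set.seed(1)
x <- scale(iris[, 1:4])   # illustrative data only
k <- 3                    # number of clusters per run
max_runs <- 10            # total number of k-means runs

# Co-membership matrix of one run: entry (i, j) is 1 when
# observations i and j were assigned to the same cluster.
comembership <- function(cl) outer(cl, cl, "==") * 1

# Each run starts from different random centers.
runs <- lapply(seq_len(max_runs), function(i) kmeans(x, centers = k)$cluster)

# Consensus of the first p runs: element-wise mean of their matrices.
consensus <- function(p) {
  Reduce(`+`, lapply(runs[seq_len(p)], comembership)) / p
}

# d(p, p+1): mean absolute difference between successive consensuses.
d_seq <- vapply(seq_len(max_runs - 1), function(p) {
  mean(abs(consensus(p) - consensus(p + 1)))
}, numeric(1))
names(d_seq) <- paste0("d(", seq_len(max_runs - 1), ",", 2:max_runs, ")")
d_seq
```

With this particular distance, d(p,p+1) can never exceed 1/(p+1), because adding one run changes each averaged entry by at most that much. The sequence therefore shrinks even when the runs disagree, so any stopping threshold should be judged against that bound.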
I thought I could represent a k-means run as a binary tree or do a
hierarchical agglomerative clustering of a matrix of cluster memberships
(1s and 0s) from p k-means runs, but maybe this isn't the best approach.
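
The membership-matrix idea might look roughly like the following sketch: each run contributes k indicator columns of 1s and 0s, the p blocks are bound side by side, and the observations are then clustered agglomeratively on those profiles. The binary (Jaccard-type) distance, the average linkage, the helper name indicator, and the iris data are all assumptions chosen for illustration.

```r
set.seed(2)
x <- scale(iris[, 1:4])   # illustrative data only
k <- 3                    # clusters per k-means run
p <- 5                    # number of k-means runs

# 0/1 indicator block for one run: one row per observation,
# one column per cluster label.
indicator <- function(cl, k) outer(cl, seq_len(k), "==") * 1

# Bind the p indicator blocks into one n x (k * p) membership matrix.
membership <- do.call(cbind, lapply(seq_len(p), function(i) {
  indicator(kmeans(x, centers = k)$cluster, k)
}))

# Agglomerative clustering of observations on their membership profiles.
hc <- hclust(dist(membership, method = "binary"), method = "average")

# Cut the tree into k groups to obtain one consensus labelling.
consensus_cluster <- cutree(hc, k = k)
table(consensus_cluster)
```

Because cluster labels are arbitrary from run to run, the indicator columns are never matched across runs; two observations end up close only when they share a cluster within many of the individual runs.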
So, is there a metric on two consensuses of k-means runs? Or is there another
approach that I can implement in R?
Many thanks for your suggestions.
M.