[R] statistical significance test for cluster agreement

Alexander Sirotkin [at Yahoo] alex_s_42 at yahoo.com
Wed Mar 24 23:03:47 CET 2004


Christian,

I think I understand your point, but I do not
completely agree with you. I also did not describe
my problem clearly enough.

> If you see two clusterings on the same data, they
> are identical, if they are 100% identical, and if
> not, then not.

What you are actually saying is that all values of
the Rand index for cluster agreement other than 1
indicate that the clusterings do not agree. I believe
that many people would disagree with this statement.

Let me explain my problem in a little bit more detail.

I have a classified data set. The classes were
obtained using non-statistical methods. What I'm
trying to do is run a clustering algorithm and
compare its results to this known classification.

I think that this is not very different from
calculating a mean and comparing it to some known value.

I think that it should be theoretically possible to
use the Rand index as a test statistic.
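
To make the idea concrete, here is a minimal sketch of one
way it might be done: a permutation test. It assumes the
null hypothesis is "the cluster labels are unrelated to the
known classes", so shuffling the cluster labels gives the
reference distribution of the Rand index. It is written in
Python rather than R, and the names rand_index and
permutation_p_value are made up for illustration; they do
not come from any package.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two
    points together, or both put them apart.
    """
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    if len(a) < 2:
        raise ValueError("need at least two points")
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def permutation_p_value(known, clustered, n_perm=999, seed=0):
    """One-sided permutation p-value for the observed Rand index.

    Assumption: under the null, cluster labels are exchangeable
    with respect to the known classes, so we shuffle them.
    """
    rng = random.Random(seed)
    observed = rand_index(known, clustered)
    shuffled = list(clustered)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if rand_index(known, shuffled) >= observed:
            hits += 1
    # The +1 counts the observed labeling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    known = [0] * 10 + [1] * 10
    clustered = [1] * 9 + [0] * 11  # one point assigned differently
    print(permutation_p_value(known, clustered))
```

A small p-value here would only say that the agreement is
larger than expected by chance, not that the two partitions
are the same, which is a different question from the one
Christian raises.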

Or maybe I'm missing something...



