[BioC] Hier.Clustering: group size effect
Kevin R. Coombes
krc at mdacc.tmc.edu
Fri Feb 3 15:47:57 CET 2006
If you already know the groups, then what's the point of doing
clustering? More precisely, what biological question do you think you
are answering with this method?
Kevin
Heike Pospisil wrote:
> Hello,
>
> I have a question concerning hierarchical clustering and the effect of group sizes.
>
> I would like to select genes that are differentially expressed between group A
> and group B. Afterwards, I wish to cluster the samples by these genes. In
> principle, it works fine, but I have a problem if the group sizes are
> significantly unequal. For example:
> group A: 53 samples
> group B: 12 samples
> The resulting clustering brings group B together, but it is not clearly
> separated from group A. However, if I randomly take 12 samples from group A
> (to get equal group sizes), the clustering is nearly perfect.
>
> I use hclust(dist(t(exprs(sub)), method="euclidean"), method="complete"),
> where ncol(sub) = size of group A + size of group B and nrow(sub) = number of
> significant genes. I also tried other distance measures, but without improvement.
>
> Does anybody have a hint as to which clustering algorithm should be preferred
> for such unequal group sizes?
>
> Thanks in advance and best wishes,
> Heike
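
For readers who want to reproduce the setup in the quoted message, here is a minimal sketch in base R. It is an illustration under stated assumptions, not the poster's actual analysis: the matrix `expr` is simulated and stands in for `exprs(sub)`, and the gene count, the shift applied to group B, and the seed are all made up. Only the group sizes (53 and 12) and the `hclust(dist(...))` call come from the post.

```r
# Simulated stand-in for exprs(sub): rows are selected genes, columns are samples.
# Gene count, effect size, and seed are illustrative assumptions, not from the post.
set.seed(1)
n_genes <- 50
n_a <- 53   # group A size, as in the post
n_b <- 12   # group B size, as in the post
group <- factor(rep(c("A", "B"), times = c(n_a, n_b)))

expr <- matrix(rnorm(n_genes * (n_a + n_b)), nrow = n_genes)
colnames(expr) <- paste0(group, seq_along(group))
# Shift group B upward in every gene to mimic differential expression.
expr[, group == "B"] <- expr[, group == "B"] + 1

# Same call as in the post: Euclidean distance between samples, complete linkage.
hc_full <- hclust(dist(t(expr), method = "euclidean"), method = "complete")
clusters_full <- cutree(hc_full, k = 2)
print(table(group, clusters_full))

# Balanced comparison: 12 randomly chosen A samples plus all 12 B samples.
keep <- c(sample(which(group == "A"), n_b), which(group == "B"))
hc_bal <- hclust(dist(t(expr[, keep]), method = "euclidean"), method = "complete")
clusters_bal <- cutree(hc_bal, k = 2)
print(table(group[keep], clusters_bal))
```

Cutting each tree into two clusters and cross-tabulating against the known labels gives a rough way to compare the unbalanced and balanced runs. The genes here are shifted by construction, so agreement with the labels says nothing about real data.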