[BioC] Re: [S] Error in clustering procedure

Mon Sep 13 21:25:36 CEST 2004

"Dimension reduction" brings up another important issue:
I had discussions with quite a few scientists who believe
that dimension reduction is not allowed, since you are
losing worthwhile information.

With respect to gene expression I believe that it makes
sense to first filter out non-variant genes to reduce the
number of dimensions.
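
To make the filtering step concrete, here is a minimal sketch in
Python, assuming expression values are held as one list per gene.
The function name filter_low_variance, the cutoff min_var and the
toy data are all hypothetical and only serve as illustration; a
real cutoff would have to be chosen for each data set.

```python
from statistics import pvariance


def filter_low_variance(expr, min_var):
    """Keep genes whose variance across samples is at least min_var.

    expr: dict mapping gene name -> list of expression values,
          one value per sample.
    min_var: hypothetical cutoff, to be chosen per data set.
    """
    return {gene: values for gene, values in expr.items()
            if pvariance(values) >= min_var}


# Toy data, invented for illustration only.
expr = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly constant
    "geneB": [1.0, 6.0, 2.0, 7.0],  # clearly varying
    "geneC": [3.0, 3.0, 3.0, 3.0],  # constant
}

kept = filter_low_variance(expr, min_var=0.5)
print(sorted(kept))
```

Only genes that pass the cutoff remain as dimensions for any
later clustering step.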

But..., these people are using hierarchical clustering
to cluster chemical compound libraries in "chemical space",
and there are no compounds to eliminate.

So, another question is: which method would be best to
cluster about one million compounds in chemical space, in
order to be able to reduce the number of compounds used in
screening by selecting only representative members of
each cluster?

Best regards
Christian

michael watson (IAH-C) wrote:
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
> 
> 
>>But MDS-like methods (note, not algorithms) are better for your stated 
>>purpose.
> 
> 
> Hi
> 
> Just thinking out-loud here, which can be a painful process...
> 
> So MDS/PCA is an exercise in dimension reduction.  Therefore, if we
> reduce the dimensionality of the dataset to few(er) dimensions which
> explain most of the variability, then order the data set by those
> dimensions, then that will place together genes (in the list) which are
> behaving similarly - is that what you are suggesting?
> 
>
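
The ordering idea raised in the quoted message can be sketched
roughly as follows. This is only an illustration under stated
assumptions, not the method anyone in this thread proposed in
detail: it uses PCA only (not MDS), keeps just the first
principal component, and finds it by plain power iteration. The
function name first_pc_scores, the start vector and the toy
profiles are hypothetical.

```python
def first_pc_scores(rows, iterations=200):
    """Project each row onto the first principal component.

    rows: list of equal-length numeric lists (one row per gene).
    The component is found by power iteration on the covariance
    matrix of the columns; its sign is arbitrary.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Covariance matrix of the columns (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    # Arbitrary start vector (an assumption of this sketch); it must
    # not be orthogonal to the leading eigenvector.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # no variance at all, nothing to iterate on
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


# Toy profiles, invented for illustration: g1/g2 rise, g3/g4 fall.
names = ["g1", "g3", "g2", "g4"]
profiles = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [1.1, 2.1, 3.2],
    [3.1, 1.9, 0.9],
]

scores = first_pc_scores(profiles)
# Sort gene names by their score on the first component.
order = [name for _, name in sorted(zip(scores, names))]
print(order)
```

In this toy case genes with similar profiles end up next to each
other in the list. Which pair comes first depends on the
arbitrary sign of the component.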