[BioC] fix

Thu Dec 18 19:10:04 MET 2003

Stephen:

> I agree with some of what you say, Chad; the problem is that most
> multivariate methods are built on top of the marginal tests. For instance,
> machine learning methods are based on gene subsets for each of k
> cross-validations.

Right. I recognize that gene selection is a central component of many
sequential data analysis schemes: at "stage 1" you pick a set of genes which
(by a selection scheme) show regulation in the array experiments, then at
stage 2 you do something with that.
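
To make the two-stage pattern concrete, here is a minimal sketch in plain
Python. Everything in it is an assumption for illustration: the simulated
data, the names welch_t and select_genes, and the choice of a Welch t
statistic as the stage 1 selection scheme (fold change, F, cyber-T, etc.
would slot in the same way). It is not anyone's actual pipeline.

```python
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for two lists of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

def select_genes(expr, labels, n_keep):
    """Stage 1: rank genes by |t| between the two groups, keep the top n_keep."""
    scores = {}
    for gene, values in expr.items():
        g0 = [v for v, y in zip(values, labels) if y == 0]
        g1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(g0, g1))
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]

# Simulated data (hypothetical): 50 genes, 6 arrays per group,
# with only g0..g4 shifted in the second group.
random.seed(0)
labels = [0] * 6 + [1] * 6
expr = {}
for i in range(50):
    shift = 3.0 if i < 5 else 0.0
    expr[f"g{i}"] = [random.gauss(shift * y, 1.0) for y in labels]

# Stage 1: marginal, gene-by-gene selection.
selected = select_genes(expr, labels, n_keep=5)

# Stage 2: whatever comes next sees only the selected genes.
stage2_input = {g: expr[g] for g in selected}
print(selected)
```

Note that stage 2 never sees the genes dropped at stage 1, so whatever the
marginal test misses is gone for good.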

My comment is still that this is a bad approach.  I'm guilty of it, too.
We are focusing on the trees instead of the ecosystem, and if we had
better covariate information or knowledge of gene-connectedness we
wouldn't be doing this.

Moreover, if what you are doing at stages 2 through k is based on 'binning'
of genes, then a low frequency of false positives at stage 1 will matter
less, and so will slightly sub-optimal single-gene power.

> Use of the appropriate test (fold/T/F/cyber-T/etc.) for subset
> selection is IMHO the most important choice.
>  
>
Yes, I agree.  It's just that the fixation on this topic, to the exclusion
of what seem to be other scientifically relevant topics, is both maddening
and disheartening.

CAS