[BioC] Best options for cross validation machine learning

Tue Jan 19 18:09:08 CET 2010

On Tue, Jan 19, 2010 at 11:11 AM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
> Hello,

Hi, Dan.

> I have a microarray dataset which I have performed an unsupervised
> Bayesian clustering algorithm on which divides the samples into four
> groups.  What I would like to do is:
> 1) Pick a group of genes that best predict which group a sample belongs to.

This is feature selection.

> 2) Determine how stable these prediction sets are through some sort of
> cross-validation (I would prefer not to divide my set into a training
> and test set for stage one)

This is cross-validation.

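Note that steps 1 and 2 have to be done together: the gene selection
must be repeated inside every cross-validation fold, using only that
fold's training samples.  If you pick the genes once on all samples and
then cross-validate the classifier, the held-out samples have already
influenced the selection and the error estimate will be too optimistic.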
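
As a rough illustration of why steps 1 and 2 go together, here is a
minimal sketch in plain Python (standard library only).  It is not
the MLInterfaces API.  Everything in it is an assumption made for the
example: the synthetic data, the between/within sum-of-squares score
used to rank genes, the nearest-centroid classifier, and all function
names.  The point is only the structure: genes are re-selected inside
each fold from the training samples, and the per-fold selections are
counted to see how stable the gene set is.

```python
import random
from collections import Counter

def make_data(n_per_group=10, n_genes=50, n_groups=4, n_informative=5, seed=0):
    """Synthetic data (assumption): gene j is shifted up in group j % n_groups."""
    rng = random.Random(seed)
    X, y = [], []
    for g in range(n_groups):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for j in range(n_informative):
                if j % n_groups == g:
                    row[j] += 3.0
            X.append(row)
            y.append(g)
    return X, y

def ss_ratio(col, labels):
    """Between-group over within-group sum of squares for one gene."""
    groups = {}
    for v, g in zip(col, labels):
        groups.setdefault(g, []).append(v)
    grand = sum(col) / len(col)
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    between = sum(len(v) * (means[g] - grand) ** 2 for g, v in groups.items())
    within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    return between / (within + 1e-12)

def select_genes(X, y, k):
    """Indices of the k highest-scoring genes."""
    n_genes = len(X[0])
    scores = [ss_ratio([row[j] for row in X], y) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(X, y, genes):
    """Per-group mean expression over the selected genes."""
    sums, n = {}, Counter(y)
    for row, g in zip(X, y):
        acc = sums.setdefault(g, [0.0] * len(genes))
        for pos, j in enumerate(genes):
            acc[pos] += row[j]
    return {g: [s / n[g] for s in acc] for g, acc in sums.items()}

def predict(centroids, genes, row):
    """Group whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((row[j] - c[pos]) ** 2 for pos, j in enumerate(genes))
    return min(centroids, key=lambda g: dist(centroids[g]))

def stratified_folds(y, n_folds, seed=0):
    """Deal each group's samples round-robin into n_folds test folds."""
    rng = random.Random(seed)
    by_group = {}
    for i, g in enumerate(y):
        by_group.setdefault(g, []).append(i)
    folds = [[] for _ in range(n_folds)]
    for idx in by_group.values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds

def cross_validate(X, y, n_folds=5, k=5, seed=0):
    """Return (accuracy, Counter of how often each gene was selected)."""
    correct, counts = 0, Counter()
    for test_idx in stratified_folds(y, n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        # Selection sees the training samples only, never the held-out ones.
        genes = select_genes(X_train, y_train, k)
        counts.update(genes)
        centroids = fit_centroids(X_train, y_train, genes)
        correct += sum(predict(centroids, genes, X[i]) == y[i] for i in test_idx)
    return correct / len(y), counts

X, y = make_data()
accuracy, selection_counts = cross_validate(X, y)
print("CV accuracy:", round(accuracy, 2))
print("times each gene was selected:", selection_counts.most_common(10))
```

Genes that turn up in every fold are the stable part of the
predictor.  Genes that appear in only one or two folds are probably
being picked up by chance.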

> These steps fall into the supervised machine learning realm which I am
> not familiar with and googling around the options seem endless.  I was
> wondering whether anyone could suggest reasonable well-established
> algorithms to use for both steps.

Check out the MLInterfaces package.  There are MANY methods that could
be applied.  It really isn't possible to boil this down to an email
answer, unfortunately.

Sean