[BioC] how can i apply random forest to expression sets of dna microarray

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Mar 9 22:42:16 CET 2012


While you should follow Vincent's advice, you might want to first
think very carefully about what the error you are getting really means.

Let's see:

> Error in randomForest.default(m,y,...): Can not handle categorical predictors
> with more than 32 categories.

The question you should ask yourself (and one I asked you some time
ago) is: where are these categorical predictors coming from?

You are building a random forest from a bunch of real valued
predictors, right? ("gene" expression), so what the heck? Where are
the categorical variables coming from?

If I were you, I'd start w/ looking at how you are specifying the
"type" of each array ... somewhere here:

> eset<-exprs(newdata)
> predhcc<-matrix("hcc",1,15)
> predhcv<-matrix("hcv",,20)
> pred<-cbind(predhcc,predhcv)
> rownames(pred)<-"Type"
> data<-eset[1:2000,]
> data<-rbind(data,pred)
> data<-t(data)

So, now you think you've got the data just how you want it.

But then I'd ask you to check:

(1) What "type" of thing is your `data` object? I guess it's still a
matrix. You can check like so:

R> is(data)

Does it say matrix?

(2) If it is a matrix, you should know the differences between a
matrix and a data.frame. They are both rectangular objects, right? So
you might ask yourself: why does R support both?

And the answer is that although both types of things are "row by
column" objects, every element in a matrix must be of the same type.
In a data.frame, only the elements within a single column must be of
the same type. Different columns can be of different types, though.
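
Here is a quick toy illustration of that difference. The numbers and
names (`m`, `m2`, `df`) are made up for the example and have nothing to
do with your expression set; treat it as a sketch, not a recipe:

```r
# Toy data only: a small numeric matrix.
m <- matrix(c(1.5, 2.5, 3.5, 4.5), nrow = 2)
typeof(m)    # storage type of the whole matrix

# Assign a single string into a copy of the matrix and look again:
# a matrix can only hold one type, so every element is converted.
m2 <- m
m2[1, 1] <- "hcc"
typeof(m2)

# A data.frame keeps a separate type for each column.
# stringsAsFactors = FALSE is set explicitly so the result does not
# depend on the R version's default.
df <- data.frame(expr = c(1.5, 2.5),
                 type = c("hcc", "hcv"),
                 stringsAsFactors = FALSE)
sapply(df, class)
```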

Think about that for a second.

Now let's go back to your code, specifically:

> data<- rbind(data,pred)

You are "rbind"-ing a numeric matrix with a one-row character matrix.
What will happen?

Find out what type of thing your new `rbind`-ed data matrix holds ...
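
One way to poke at this is on a tiny stand-in first. Everything below
(`expr`, `labels`, `combined`, the gene and sample names) is
hypothetical toy data that only mimics the shape of your code, so run
the same checks on your real `data` afterwards:

```r
# Toy stand-in for the expression values: 2 genes x 2 samples.
expr <- matrix(c(7.1, 8.2, 6.3, 9.4), nrow = 2,
               dimnames = list(c("gene1", "gene2"), c("s1", "s2")))

# Toy stand-in for the one-row matrix of class labels.
labels <- matrix(c("hcc", "hcv"), nrow = 1,
                 dimnames = list("Type", NULL))

# Same pattern as in your code: bind the labels on as an extra row,
# then transpose so that samples are in rows.
combined <- t(rbind(expr, labels))

# Compare what the matrix held before and what it holds now.
mode(expr)
mode(combined)
str(combined)
```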

You've almost crossed the finish line now, so I'll leave you here so
that you can pull yourself over it.

But before you do that, please read up more on R basics so you can
more easily diagnose these things for yourself in the future.

Hope that was helpful, and good luck!


Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
