[R] Error: Can not handle categorical predictors with more than 32 categories.

Uwe Ligges ligges at statistik.uni-dortmund.de
Wed Mar 23 08:32:25 CET 2005


Melanie Vida wrote:

> Hi All,
> 
> My question is about an error generated when using randomForest 
> in R. Is there a special way to format the data in order to avoid this 
> error, or am I misunderstanding what the error implies?
> 
> "Error in randomForest.default(m, y, ...) :
>        Can not handle categorical predictors with more than 32 categories."
> 
> This is generated from the command line:
>  > credit.rf <- randomForest(V16 ~ ., data=credit, mtry=2,
>                              importance=TRUE, do.trace=100)
> 
> The data set is the credit-screening data from the UCI repository, 
> ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data. 
> This data consists of 690 samples and 16 attributes.
> The attribute information includes:
> 
>    A1:    b, a.
>    A2:    continuous.
>    A3:    continuous.
>    A4:    u, y, l, t.
>    A5:    g, p, gg.
>    A6:    c, d, cc, i, j, k, m, r, q, w, x, e, aa, ff.
>    A7:    v, h, bb, j, n, z, dd, ff, o.
>    A8:    continuous.
>    A9:    t, f.
>    A10:   t, f.
>    A11:   continuous.
>    A12:   t, f.
>    A13:   g, p, s.
>    A14:   continuous.
>    A15:   continuous.
>    A16:   +, -  (class attribute)
>
> Has anyone tried randomForests in R on the credit-screening data set 
> from the UCI repository?


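You almost certainly forgot to set  na.strings = "?"  in read.table().
Missing values in crx.data are coded as "?", so without that argument 
every numeric column containing a "?" is read as a factor with one level 
per distinct value, which easily exceeds 32 categories.
Look at str(credit) to see which numerics were converted to factors for 
that reason.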
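
A minimal sketch of the difference, assuming a comma-separated file without 
a header line, as in crx.data. The three rows below are made up for 
illustration, and the names sample_lines, bad and good are hypothetical. 
stringsAsFactors = TRUE is spelled out because newer R versions no longer 
convert strings to factors by default.

```r
# Illustrative rows in the crx.data layout; "?" marks a missing value.
# These rows are invented for the example, not copied from the UCI file.
sample_lines <- c(
  "b,30.83,0,u,g,w,v,1.25,t,t,1,f,g,202,0,+",
  "a,?,4.46,u,g,q,h,3.04,t,t,6,f,g,43,560,+",
  "b,24.50,0.5,u,g,q,h,1.5,t,f,0,f,g,?,824,-"
)

# Without na.strings: "?" is treated as ordinary text, so V2 and V14
# cannot be parsed as numbers and become factors.
bad <- read.table(text = sample_lines, sep = ",", header = FALSE,
                  stringsAsFactors = TRUE)

# With na.strings = "?": "?" becomes NA and V2 and V14 stay numeric.
good <- read.table(text = sample_lines, sep = ",", header = FALSE,
                   na.strings = "?", stringsAsFactors = TRUE)

str(bad[, c("V2", "V14")])
str(good[, c("V2", "V14")])

# Number of levels per factor column; on the full data, anything above 32
# here is a column that triggers the randomForest error.
sapply(Filter(is.factor, bad), nlevels)
```

For the real file, the same read.table() call would take the file name in 
place of text = sample_lines. Note that the data then contains NAs, which 
randomForest() does not accept by default in the formula interface; passing 
na.action = na.omit or na.action = na.roughfix is one way to deal with that.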

Uwe Ligges



> Thanks in advance for any useful hints and tips,
> 
> Melanie
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html



