[R] problem with formula argument to randomForest

Ed Komp komp at ittc.ku.edu
Wed Oct 28 15:25:49 CET 2015



The randomForest function raises an error whenever I supply it with
a formula that uses the function I() to inhibit interpretation.
When I do so, I always get an error like this one:
     Error in unique(c("AsIs", oldClass(x))) : object 'Age' not found

Is this because of:
1.  a restriction for the randomForest function that I have not seen documented;
2.  a deficiency / error in randomForest; or
3.  an error in my calling sequence?

I am including a very simple example to demonstrate the problem.
Simply using   I(<colname>)  generates the error.
This is not a meaningful use of I(), but it keeps the example simple.
My actual interest is in terms of the form I(<col1> / <col2>).
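
For the ratio case, one possible way to sidestep I() altogether is to compute the ratio as an ordinary column first and then use a plain formula. This is only a sketch under assumptions: the names kyphosis2 and AgePerNumber are hypothetical, the Age / Number ratio is chosen purely for illustration, and the randomForest call is guarded so the snippet still runs where that package is not installed.

```r
library(rpart)  # provides the kyphosis data frame

# Hypothetical derived predictor: store the ratio as a regular column,
# so the formula does not need I() at all.
kyphosis2 <- transform(kyphosis, AgePerNumber = Age / Number)

# Plain formula that refers to the precomputed column.
ratioFormula <- Kyphosis ~ AgePerNumber + Start

# Fit only if randomForest is available (assumption: it may not be installed).
if (requireNamespace("randomForest", quietly = TRUE)) {
  fitRatio <- randomForest::randomForest(ratioFormula, data = kyphosis2)
}
```

Any new data passed to predict() would need the same derived column added beforehand, which is the main cost of this approach compared with keeping the transformation inside the formula.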

I also demonstrate that the usage of I() in a formula works just fine
for another discrimination function, lda.

The sample code is included after my signature, along with line-by-line output.

Thanks in advance!

Ed Komp
ITTC Lab, University of Kansas

                                        ===============
> library(rpart)
> library(MASS)
> library(randomForest)
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
> formula <- as.formula('Kyphosis ~ Age + Number + Start')
> formula
Kyphosis ~ Age + Number + Start
> formulaWithI <- as.formula('Kyphosis ~ I(Age) + Number + Start')
> formulaWithI
Kyphosis ~ I(Age) + Number + Start
> fit <- randomForest(formula,   data=kyphosis)
> fitWithI <- randomForest(formulaWithI,   data=kyphosis)
Error in unique(c("AsIs", oldClass(x))) : object 'Age' not found
>
> fit <- lda(formula, data = kyphosis)
> fitWithI <- lda(formula, data = kyphosis)
> fitWithI
Call:
lda(formula, data = kyphosis)

Prior probabilities of groups:
   absent   present
0.7901235 0.2098765

Group means:
             Age   Number     Start
absent  79.89062 3.750000 12.609375
present 97.82353 5.176471  7.294118

Coefficients of linear discriminants:
                LD1
Age     0.005910971
Number  0.291501797
Start  -0.170496626
>
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] randomForest_4.6-12 MASS_7.3-44         rpart_4.1-10
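
In the session above, the second lda() call is given formula rather than formulaWithI, so that run does not actually exercise I(). A minimal sketch of the comparison as it was presumably intended, assuming the kyphosis data from rpart and lda from MASS; fitLdaWithI is a name introduced here for illustration only:

```r
library(rpart)  # kyphosis data frame
library(MASS)   # lda()

# Same formula as in the session, written directly as a formula object.
formulaWithI <- Kyphosis ~ I(Age) + Number + Start

# Pass the I() formula itself, not the plain one.
fitLdaWithI <- lda(formulaWithI, data = kyphosis)
fitLdaWithI
```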


