[R] randomForest.error: length of response must be the same as predictors
Gavin Simpson
gavin.simpson at ucl.ac.uk
Thu Jul 3 10:50:22 CEST 2008
On Thu, 2008-07-03 at 12:11 +0530, Soumyadeep Nandi wrote:
> My data looks like:
> A,B,C,D,Class
> 1,2,0,2,cl1
> 1,5,1,9,cl1
> 3,2,1,2,cl2
> 7,2,1,2,cl2
> 2,2,1,2,cl2
> 1,2,1,5,cl2
> 0,2,1,2,cl2
> 4,2,1,2,cl2
> 3,5,1,2,cl2
> 3,2,12,3,cl2
> 3,2,4,2,cl2
>
> The steps followed are:
> trainfile <- read.csv("TrainFile",head=TRUE)
> datatrain <- subset(trainfile,select=c(-Class))
> classtrain <- (subset(trainfile,select=Class))
> rf <- randomForest(datatrain, classtrain)
>
> Error in randomForest.default(classtrain, datatrain) :
> length of response must be the same as predictors
> In addition: Warning message:
> In randomForest.default(classtrain, datatrain) :
> The response has five or fewer unique values. Are you sure you want to do
> regression?
>
> Could someone suggest me where I am going wrong.
Yep, look at class(classtrain):
> class(classtrain)
[1] "data.frame"
subset() returns a data.frame, which is a special case of a list. The
length of a list (and therefore of a data frame) is its number of
components (columns), not the number of rows you might expect:
> length(classtrain)
[1] 1
There is *1* component in the list, i.e. one column that you can get at
with '$'. Hence randomForest() complains because, to it, the lengths of
x and y are not the same when evaluated using length().
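To see the mismatch on a small example, here is a minimal sketch in base R. The data frame below is a hypothetical three-row stand-in for the poster's file, built inline rather than read from "TrainFile":

```r
# Hypothetical stand-in for the poster's data (not read from "TrainFile")
trainfile <- data.frame(A = c(1, 1, 3),
                        B = c(2, 5, 2),
                        Class = factor(c("cl1", "cl1", "cl2")))

# subset() keeps the data.frame class even when only one column is selected
classtrain <- subset(trainfile, select = Class)

class(classtrain)    # "data.frame"
length(classtrain)   # 1: the number of columns (list components)
nrow(classtrain)     # 3: the number of observations

# Extracting the column itself gives a factor whose length is the row count
y <- classtrain[, 1]
length(y)            # 3
```

So length() on the one-column data frame reports 1, while the extracted column has one element per row.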
Note that ?randomForest does state that y should be a response 'vector',
so you are not supplying what is required.
Two ways to proceed:
rf <- randomForest(Class ~ ., data = trainfile)
or, if you really don't want the formula interface, extract the single
column as a vector by subsetting:
rf <- randomForest(datatrain, classtrain[,1])
[NB: as classtrain is of class "data.frame", drop() will not work on it,
because a data frame doesn't have a dim attribute]
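As a sketch of the second route, the response can be pulled out as a vector in several equivalent ways. The data are retyped from the original post rather than read from "TrainFile", and the names y1, y2 and y3 are only illustrative. The factor() call is an added precaution not in the original post: since R 4.0.0, read.csv() no longer converts strings to factors by default, and randomForest() assumes classification only when the response is a factor. The fit itself is guarded so the example still runs where the randomForest package is not installed.

```r
# Data retyped from the original post; Class deliberately kept as character
trainfile <- data.frame(
  A = c(1, 1, 3, 7, 2, 1, 0, 4, 3, 3, 3),
  B = c(2, 5, 2, 2, 2, 2, 2, 2, 5, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 12, 4),
  D = c(2, 9, 2, 2, 2, 5, 2, 2, 2, 3, 2),
  Class = c("cl1", "cl1", rep("cl2", 9)),
  stringsAsFactors = FALSE
)

# Make the response a factor so that classification is assumed
trainfile$Class <- factor(trainfile$Class)

datatrain  <- subset(trainfile, select = -Class)
classtrain <- subset(trainfile, select = Class)

# Three equivalent ways to get the response as a factor vector
y1 <- classtrain[, 1]    # matrix-style indexing drops to a vector
y2 <- classtrain[[1]]    # list-style extraction of the first component
y3 <- trainfile$Class    # straight from the original data frame

# Fit only if the randomForest package is available
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(datatrain, y1)
  print(rf)
}
```

Any of y1, y2 or y3 can be passed as the response; each has one element per row of datatrain.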
HTH
G
>
> Thanks
>