[R] Feature selection using R Caret package: Error in seeds[[num_rs + 1L]] : subscript out of bounds

Its August the_august at yahoo.com
Tue Dec 20 22:30:25 CET 2016


Hello All,
I have a dataset of six samples and 1530 variables/features and wish to know the importance of the features. I'm trying to use the "Rank Features By Importance" approach described in Feature Selection with the Caret R Package (http://machinelearningmastery.com/feature-selection-with-the-caret-r-package/)
I'm using the following code:
    rm(list=ls())
    set.seed(12345)
    library(mlbench)
    library(caret)
    options(error=utils::recover)

    #Pastebin link for Data: http://pastebin.com/raw/cg0Kiueq
    mydata.df <- read.table("data.PasteBin.txt", header=TRUE,sep="\t",stringsAsFactors=TRUE)
    dim(mydata.df)

    lvq.control <- trainControl(method="LOOCV")
    lvq.model <- train(ID~., data=mydata.df, method="lvq", trControl=lvq.control ) #FAILS

    importance <- varImp(lvq.model, scale=FALSE)
    print(importance)
    plot(importance)
The data can be downloaded from the following Pastebin link:
http://pastebin.com/raw/cg0Kiueq

The program fails to execute with the following error and debug messages:
    Error in seeds[[num_rs + 1L]] : subscript out of bounds
    1: train(ID ~ ., data = mydata.df, method = "lvq", trControl = lvq.control)
    2: train.formula(ID ~ ., data = mydata.df, method = "lvq", trControl = lvq.con
    3: train(x, y, weights = w, ...)
    4: train.default(x, y, weights = w, ...)

I've read in multiple sources (http://davidhughjones.blogspot.com/2015/04/r-tip-caret-error.html) that caret issues an error like this unless the response variable is of class factor.
However, my response variable ('ID') is indeed a factor:
    > str(mydata.df$ID)
     Factor w/ 2 levels "NONRC","RC": 2 2 1 1 2 1
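For anyone without the data file at hand, the same factor check can be sketched on a toy data frame. Note that `toy.df` and its `f1` column are hypothetical stand-ins made up for illustration, not values from the Pastebin data; only the `ID` labels mirror the `str()` output above. This uses base R only and does not reproduce the caret error.

```r
# Hypothetical stand-in for mydata.df: six samples, one made-up feature (f1),
# and a two-level response with the same labels as the real ID column.
toy.df <- data.frame(
  ID = c("RC", "RC", "NONRC", "NONRC", "RC", "NONRC"),
  f1 = c(0.1, 0.4, 0.3, 0.9, 0.5, 0.7),
  stringsAsFactors = TRUE
)

# The response should be a factor for a classification model.
stopifnot(is.factor(toy.df$ID))

# Count the samples in each class.
class.counts <- table(toy.df$ID)
print(class.counts)

# Under leave-one-out resampling, each fit is trained on all rows but one.
n.train <- nrow(toy.df) - 1
print(n.train)
```

With only six rows, the per-class counts and the size of each leave-one-out training set are worth keeping in view alongside the class of the response.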
The details of my R and caret versions are as follows:
    > packageVersion("caret")
    [1] ‘6.0.70’
    R version 3.3.0 (2016-05-03)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 7 x64 (build 7601) Service Pack 1
Can someone please suggest a remedy?
Thanks in advance.