[R] Selecting A List of Columns
Sparks, John James
jspark4 at uic.edu
Fri May 17 08:51:56 CEST 2013
Dear R Helpers,
I need help with a slightly unusual situation in which I am trying to
select some columns from a data frame. I know how to use subset() with
a character vector of column names, as in:
x <- as.data.frame(matrix(c(1, 2, 3,
                            1, 2, 3,
                            1, 2, 2,
                            1, 2, 2,
                            1, 1, 1), ncol = 3, byrow = TRUE))
all.cols <- colnames(x)
to.keep <- all.cols[1:2]
Kept <- subset(x, select = to.keep)
Kept
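The selection above works because to.keep is a plain character vector with one element per column name. As a minimal sketch, the same selection can be written with bracket indexing, which accepts the same kind of vector (Kept2 is a name introduced here only for illustration):

```r
# Same toy data frame as in the post; matrix columns get the names V1, V2, V3
x <- as.data.frame(matrix(c(1, 2, 3,
                            1, 2, 3,
                            1, 2, 2,
                            1, 2, 2,
                            1, 1, 1), ncol = 3, byrow = TRUE))

to.keep <- colnames(x)[1:2]

# to.keep is a character vector with one element per wanted column
is.character(to.keep)
length(to.keep)

# Bracket indexing with a character vector keeps the result a data frame
Kept2 <- x[to.keep]
colnames(Kept2)
```

Checking is.character() and length() on the selection vector is a quick way to confirm it has the shape that subset() and bracket indexing expect.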
However, when I try to select columns based on the most important
variables from a random forest, I get stuck. The example below
demonstrates the problem.
library(randomForest)
data(mtcars)

# Fit a random forest and keep both importance measures
mtcars.rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)
# data.frame() renames the "%IncMSE" column to "X.IncMSE"
Importance <- data.frame(mtcars.rf$importance)
Importance

# Top 3 variables by %IncMSE
MSEImportance <- head(Importance[order(Importance$X.IncMSE, decreasing = TRUE), ], 3)
MSEVars <- row.names(MSEImportance)
MSEVars <- data.frame(MSEVars, stringsAsFactors = FALSE)
colnames(MSEVars) <- "Vars"

# Top 3 variables by IncNodePurity
NodeImportance <- head(Importance[order(Importance$IncNodePurity, decreasing = TRUE), ], 3)
NodeVars <- row.names(NodeImportance)
NodeVars <- data.frame(NodeVars, stringsAsFactors = FALSE)
colnames(NodeVars) <- "Vars"

# Combine both lists and drop duplicates
ImportantVars <- rbind(MSEVars, NodeVars)
ImportantVars <- unique(ImportantVars)
nrow(ImportantVars)
ImportantVars <- as.character(ImportantVars)
ImportantVars

CarsVarsKept <- subset(mtcars, select = ImportantVars)
Error in `[.data.frame`(x, r, vars, drop = drop) :
undefined columns selected
Any help on how to select these columns from the data frame would be most
appreciated.
--John J. Sparks, Ph.D.
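One possible reading of the error, offered as a sketch rather than a definitive diagnosis: in the post, ImportantVars is still a one-column data frame when as.character() is applied, and as.character() on a data frame deparses each column into a single string instead of returning the column's elements. The variable names below are placeholders (the actual top variables depend on the fitted forest), and collapsed and vars are names introduced here only for illustration.

```r
# Hypothetical stand-in for the one-column data frame built in the post;
# the real names depend on the random forest fit.
ImportantVars <- data.frame(Vars = c("wt", "disp", "hp", "cyl"),
                            stringsAsFactors = FALSE)

# as.character() on a data frame returns one deparsed string per column,
# so a one-column data frame gives a character vector of length 1.
collapsed <- as.character(ImportantVars)
length(collapsed)

# Extracting the column instead gives one element per variable name.
vars <- ImportantVars$Vars
length(vars)

# A character vector of real column names is what subset() can use.
CarsVarsKept <- subset(mtcars, select = vars)
colnames(CarsVarsKept)
```

If this reading is right, keeping the names as plain character vectors throughout (for example, combining the two row.names() results with union()) would avoid the data frame round trip altogether.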