[R] Random Forest, Giving More Importance to Some Data

Lorenzo Isella lorenzo.isella at gmail.com
Sun Mar 24 11:43:59 CET 2013


Dear All,
I am using randomForest to predict the final selling price of some items.
As often happens, I have a lot of (noisy) historical data, but my question is
not really about data cleaning.
The dataset for which I need to carry out the predictions consists of fairly
recent sales, or even sales that will take place in the near future.
As a consequence, the historical data should be weighted somehow: the older
they are, the less they should matter for the prediction.
Any idea about how this could be achieved?
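One idea I have been toying with (I am not sure it is statistically sound) is
to resample the training rows with recency-based weights before fitting, since
randomForest() does not take per-observation weights for regression. Below is
only a rough sketch: sale_dates is a hypothetical vector of sale dates aligned
with the rows of trainRF, and half_life is an arbitrary decay parameter.

###########################################################################
library(randomForest)

## Hypothetical sketch: exponentially downweight old sales.
## sale_dates, trainRF and prices_train are assumed to be aligned row by row.
half_life <- 180                                  ## decay half-life in days (arbitrary)
age_days  <- as.numeric(Sys.Date() - sale_dates)  ## age of each sale in days
w         <- 0.5 ^ (age_days / half_life)         ## recent sales get weight close to 1

## Weighted bootstrap: recent rows are drawn more often than old ones.
idx <- sample(nrow(trainRF), size = nrow(trainRF), replace = TRUE, prob = w)

rf_weighted <- randomForest(trainRF[idx, , drop = FALSE],
                            prices_train[idx],
                            nodesize = 5,
                            ntree    = 200)
###########################################################################

If this is a sensible direction, the resampling step could simply be done
inside each %dopar% iteration of the loop below before calling randomForest().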
Please find below a snippet showing how I use the randomForest library (on  
a multi-core machine).
Any suggestion is appreciated.
Cheers

Lorenzo

###########################################################################
rf_model <- foreach(iteration  = 1:cores,
                    ntree      = rep(50, 4),   ## one ntree value per core (4 cores here)
                    .combine   = combine,
                    .packages  = "randomForest") %dopar% {
  ## log the progress of each worker to a file
  sink("log.txt", append = TRUE)
  cat(paste("Starting iteration", iteration, "\n"))
  fit <- randomForest(trainRF,
                      prices_train,
                      ## mtry = 20,
                      nodesize = 5,
                      ## maxnodes = 140,
                      importance = FALSE,
                      do.trace = 10,
                      ntree = ntree)
  sink()   ## restore normal output before returning the forest
  fit
}
###########################################################################


