[R] [caret package] [trainControl] supplying predefined partitions to train with cross validation

Fabon Dzogang fabon.dzogang at lip6.fr
Fri May 6 12:32:04 CEST 2011


Hello,

Thank you for your reply, but I'm not sure your code answers my needs.
From what I read, it creates a 10-fold partition and then extracts the
kth partition for further processing.

My question was rather: once I have a 10-fold partition of my data,
how do I supply it to the "train" function of the caret package? Here's
some sample code:

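library(caret)

folds <- createFolds(my_dataset_classes, 10)

# I can't use index=folds on this one: index expects the training rows, but
# createFolds returns the held-out rows, so train would fit on 1/k of the
# data and test on the remaining (k-1)/k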
t_control <- trainControl(method="cv", number=10)

# here I would like train to take account of my predefined folds
model <- train(my_dataset_predictors, my_dataset_classes,
method="svmLinear", trControl = t_control)
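
A possible way around this, sketched under a few assumptions: each fold returned by createFolds is inverted with base R so that it lists the rows to train on, and the resulting list is passed as "index". The iris data is only a stand-in for my_dataset_predictors and my_dataset_classes, all_rows and train_idx are names made up for the illustration, and this has not been checked against caret 2.88. Recent caret versions also accept createFolds(..., returnTrain = TRUE), which should give the training rows directly.

```r
library(caret)  # method = "svmLinear" also needs the kernlab package

set.seed(111)

# Stand-in data (assumption): iris replaces the poster's own dataset
my_dataset_predictors <- iris[, 1:4]
my_dataset_classes <- iris$Species

# By default createFolds() returns the held-out row indices of each fold
folds <- createFolds(my_dataset_classes, k = 10)

# trainControl(index = ...) expects the rows to train on, so invert each fold
all_rows <- seq_along(my_dataset_classes)
train_idx <- lapply(folds, function(held_out) setdiff(all_rows, held_out))

# The same train_idx can be reused for every model to be compared
t_control <- trainControl(method = "cv", number = 10, index = train_idx)

model <- train(my_dataset_predictors, my_dataset_classes,
               method = "svmLinear", trControl = t_control)
```

Because train_idx is built once and stored, passing the same t_control to several train calls should make every model see identical partitions.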

Cheers,
Fabon.

On Fri, May 6, 2011 at 10:59 AM, neetika nath <nikkihathi at gmail.com> wrote:
> Hi,
> I did a similar experiment with my data. Maybe the following code will give
> you some idea. It might not be the best solution, but it worked for me.
> Please do share if you find another approach.
> Thank you.
> #### CODE###
>
> library(dismo)
>
> set.seed(111)
>
> dd<-read.delim("yourfile.csv",sep=",",header=T)
>
> # To keep a check on error
>
> options(error=utils::recover)
>
> # dd: data to be split for 10-fold CV; this splits the complete data into
> # 10 folds
>
> number<-kfold(dd, k=10)
>
> # Case 1: k == 1 (first fold)
>
> k <- 1
>
> x<-NULL;
>
> # Retrieve all the row indices (from your data) for the 1st fold in x, so
> # that you can use them as the test set and the remaining rows as the
> # training set for the 1st iteration.
>
> x<-which(number==k)
>
> On Thu, May 5, 2011 at 11:43 PM, Fabon Dzogang <fabon.dzogang at lip6.fr>
> wrote:
>>
>> Hi all,
>>
>> I run R 2.11.1 under ubuntu 10.10 and caret version 2.88.
>>
>> I use the caret package to compare different models on a dataset. In
>> order to compare their different performances I would like to use the
>> same data partitions for every model. I understand that using an LGOCV
>> or a boot type re-sampling method along with the "index" argument of
>> the trainControl function, one is able to supply a training partition
>> to the train function.
>>
>> However, I would like to apply a 10-fold cross validation to validate
>> the models and I did not find any way to supply some predefined
>> partition (created with createFolds) in this setting. Any help?
>>
>> Thank you, and great package by the way!
>>
>> Fabon Dzogang.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Fabon Dzogang


