[R] difference between createPartition and createfold functions
bby2103 at columbia.edu
Sun Oct 2 21:54:20 CEST 2011
Hi Steve,
Thanks for the note. I did try the example, and the result didn't make
sense to me. For splitting a vector, what you describe is a big
difference between them. For splitting a data frame, I now wonder if these
two functions are the wrong choices. They seem to split the columns, at
least in the few things I tried.
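
One possible explanation, offered as a guess rather than something confirmed in this thread: both functions take the outcome vector as their first argument, and in R the length of a data frame is its number of columns. Passing the whole data frame would therefore yield column indices. A minimal base-R sketch of that point, using the built-in iris data as a stand-in for the real dataset (the names `df`, `n_cols`, `n_rows`, and `row_ids` are only for illustration):

```r
# length() of a data frame counts columns, not rows
df <- iris                        # stand-in data: 150 rows, 5 columns
n_cols <- length(df)              # 5
n_rows <- nrow(df)                # 150

# Row indices come from a vector with one entry per row,
# e.g. the outcome column, not from the data frame itself
row_ids <- seq_along(df$Species)  # 1, 2, ..., 150
```

If this is what is happening, passing a single column such as `df$Species` instead of `df` should give row indices that can then be used to subset the data frame.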
Bonnie
Quoting Steve Lianoglou <mailinglist.honeypot at gmail.com>:
> Hi,
>
> On Sun, Oct 2, 2011 at 2:47 PM, <bby2103 at columbia.edu> wrote:
>> Hello,
>>
>> I'm trying to separate my dataset into 4 parts with the 4th one as the test
>> dataset, and the other three to fit a model.
>>
>> I've been searching for the difference between these two functions in the
>> caret package, but the most I can find is this:
>>
>> A series of test/training partitions are created using createDataPartition
>> while createResample creates one or more bootstrap samples. createFolds
>> splits the data into k groups.
>>
>> Am I missing something here? What is the difference between
>> createDataPartition and createFolds? I guess they wouldn't be equivalent.
>
> Well -- you could always look at the source code to find out (enter
> the name of the function into your R console and hit return), but you
> can also do some experimentation to find out. Using the data from the
> Examples section of caret::createFolds:
>
> R> library(caret)
> R> data(oil)
> R> part <- createDataPartition(oilType, 2)
> R> fold <- createFolds(oilType, 2)
>
> R> length(Reduce(intersect, part))
> [1] 27
>
> R> length(Reduce(intersect, fold))
> [1] 0
>
> Looks like `createDataPartition` returns smaller subsets of your data,
> but allows the same example to appear in more than one of them.
>
> `createFolds` doesn't allow the same example to appear in more than
> one fold.
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
>
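
Putting the two messages together, the four-part split from the original question might be sketched as follows. This assumes the caret package is installed and reuses the oil data from the example above. The data frame `dat` and the names `folds`, `test`, and `train` are hypothetical, introduced only for illustration.

```r
library(caret)
data(oil)                          # loads fattyAcids (data frame) and oilType (factor)

set.seed(1)                        # folds are random; fix the seed to reproduce them
dat <- data.frame(fattyAcids, oilType = oilType)

# Pass the outcome vector, not the data frame. The result is a list of
# four disjoint sets of row indices.
folds <- createFolds(dat$oilType, k = 4)

test  <- dat[folds[[4]], ]         # fourth fold held out as the test set
train <- dat[-folds[[4]], ]        # remaining three folds used to fit the model
```

Because the folds do not overlap, every row ends up in exactly one of `test` or `train`. If a single train/test split is all that is needed, `createDataPartition(dat$oilType, p = 0.75, list = FALSE)` should return the training row indices directly, though the split will only be approximately three to one.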