[R] how is the model resample performance calculated by caret?

Max Kuhn mxkuhn at gmail.com
Fri Feb 28 19:11:02 CET 2014


On Fri, Feb 28, 2014 at 1:13 AM, zhenjiang zech xu
<zhenjiang.xu at gmail.com> wrote:
> Dear all,
>
> I ran 10-fold cross-validation, repeated 5 times, with the partial least
> squares regression model provided by the caret package. Can anyone tell me
> how the values in plsTune$resample are calculated? Are they computed on each
> hold-out set using a model trained on the rest of the data, with the optimal
> parameter value tuned by the preceding cross-validation?

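Yes, those values are the hold-out performance estimates for each
resample at the final (selected) tuning parameter value. The
returnResamp argument of trainControl() controls this: the default,
"final", keeps only those rows, while "all" returns the resamples
from all candidate models too.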
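
A minimal sketch of the option mentioned above, which in current caret
versions is the returnResamp argument of trainControl(). The simulated
data, the tuneLength value and the object names are stand-ins chosen for
illustration (the original 524 x 615 data set is not available), and the
caret and pls packages are assumed to be installed.

```r
library(caret)  # assumes the caret and pls packages are installed

set.seed(1)
# Hypothetical stand-in data, not the data from the original post
x <- matrix(rnorm(100 * 10), ncol = 10)
colnames(x) <- paste0("x", 1:10)
y <- x[, 1] - 2 * x[, 2] + rnorm(100)

# 10-fold CV repeated 5 times, one-SE rule, keep resamples for all models
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
                     selectionFunction = "oneSE",
                     returnResamp = "all")  # the default is "final"

plsTune <- train(x, y, method = "pls",
                 preProcess = c("center", "scale"),
                 tuneLength = 5, trControl = ctrl)

# One row per resample and per candidate ncomp,
# instead of only the rows for the selected ncomp
head(plsTune$resample)
table(plsTune$resample$ncomp)
```

With returnResamp = "final" the same call would keep only the rows for
the selected ncomp, which is what the output in the original question
shows.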

> So in the following
> example, the 5 repeats of 10-fold cross-validation first select ncomp = 2 as
> the best value; then a model with ncomp = 2 is built on the training data
> and used to predict the hold-out data, giving an RMSE and an Rsquared for
> each resample. Is my understanding correct?

It is.
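
The per-resample calculation described in the question can be sketched
by hand in base R. This is only an illustration under stated
assumptions, not caret's internal code: lm() stands in for a PLS fit
with a fixed ncomp, the data are simulated, and a single 10-fold split
is used rather than 5 repeats. Rsquared is taken as the squared
correlation between predicted and observed values, which is what
caret's default summary reports.

```r
set.seed(42)
n <- 100
# Hypothetical stand-in data
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

k <- 10
fold <- sample(rep(seq_len(k), length.out = n))  # random fold assignment

per_fold <- do.call(rbind, lapply(seq_len(k), function(i) {
  train_dat <- dat[fold != i, ]  # data outside fold i
  test_dat  <- dat[fold == i, ]  # hold-out set for fold i
  # Stand-in for a PLS fit with the tuning parameter held fixed
  fit  <- lm(y ~ x1 + x2, data = train_dat)
  pred <- predict(fit, newdata = test_dat)
  data.frame(RMSE     = sqrt(mean((test_dat$y - pred)^2)),
             Rsquared = cor(pred, test_dat$y)^2,
             Resample = sprintf("Fold%02d", i))
}))

per_fold                                      # one row per hold-out set
colMeans(per_fold[, c("RMSE", "Rsquared")])   # averaged performance
```

Each row of per_fold plays the role of one row of plsTune$resample.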

Max

>
>
>> plsTune
> 524 samples
> 615 predictors
>
> Pre-processing: centered, scaled
> Resampling: Cross-Validation (10 fold, repeated 5 times)
>
> Summary of sample sizes: 472, 472, 471, 471, 471, 471, ...
>
> Resampling results across tuning parameters:
>
>   ncomp  RMSE  Rsquared  RMSE SD  Rsquared SD
>   1      16.8  0.434     1.47     0.0616
>   2      14.3  0.612     2.21     0.0768
>   3      13.5  0.704     6.33     0.145
>   4      14.6  0.706     9.29     0.163
>   5      15.2  0.703     10.9     0.172
>   6      16.5  0.69      13.4     0.181
>   7      18.4  0.672     17.8     0.194
>   8      20    0.651     20.4     0.199
>   9      20.9  0.634     20.9     0.199
>   10     22.1  0.613     22.1     0.197
>   11     23.3  0.599     23.8     0.198
>   12     24    0.588     24.7     0.198
>   13     24.9  0.572     25.2     0.197
>   14     25.8  0.557     26.2     0.194
>   15     26.2  0.544     25.8     0.191
>   16     26.6  0.532     25.5     0.187
>
> RMSE was used to select the optimal model using  the one SE rule.
> The final value used for the model was ncomp = 2.
>>
>> plsTune$resample
>    ncomp     RMSE  Rsquared    Resample
> 1      2 13.61569 0.6349700 Fold06.Rep4
> 2      2 16.02091 0.5808985 Fold05.Rep1
> 3      2 12.59985 0.6008357 Fold03.Rep5
> 4      2 13.20069 0.6296245 Fold02.Rep3
> 5      2 12.43419 0.6560434 Fold04.Rep2
> 6      2 15.36510 0.5954177 Fold04.Rep5
> 7      2 12.70028 0.6894489 Fold03.Rep2
> 8      2 13.34882 0.6468300 Fold09.Rep3
> 9      2 14.80217 0.5575010 Fold08.Rep3
> 10     2 19.03705 0.4907630 Fold05.Rep4
> 11     2 14.26704 0.6579390 Fold10.Rep2
> 12     2 13.79060 0.5806663 Fold05.Rep3
> 13     2 14.83641 0.5918039 Fold05.Rep2
> 14     2 12.48721 0.7011439 Fold01.Rep3
> 15     2 14.98765 0.5866102 Fold07.Rep4
> 16     2 10.88100 0.7597167 Fold06.Rep1
> 17     2 13.60705 0.6321377 Fold08.Rep5
> 18     2 13.42618 0.6136031 Fold08.Rep4
> 19     2 13.26066 0.6784586 Fold07.Rep1
> 20     2 13.20623 0.6812341 Fold03.Rep3
> 21     2 18.54275 0.4404729 Fold08.Rep2
> 22     2 11.80312 0.7177681 Fold05.Rep5
> 23     2 18.56271 0.4661072 Fold03.Rep1
> 24     2 13.54879 0.5850439 Fold10.Rep3
> 25     2 14.10859 0.5994811 Fold06.Rep5
> 26     2 13.68329 0.6701091 Fold01.Rep5
> 27     2 16.12123 0.5401200 Fold10.Rep1
> 28     2 12.92250 0.6917220 Fold06.Rep3
> 29     2 12.94366 0.6400066 Fold06.Rep2
> 30     2 12.39889 0.6790578 Fold01.Rep2
> 31     2 13.48499 0.6759649 Fold01.Rep1
> 32     2 12.52938 0.6728476 Fold03.Rep4
> 33     2 16.43352 0.5795160 Fold09.Rep5
> 34     2 12.53991 0.6550694 Fold09.Rep4
> 35     2 12.78708 0.6304606 Fold08.Rep1
> 36     2 13.97559 0.6655688 Fold04.Rep3
> 37     2 15.31642 0.5124997 Fold09.Rep2
> 38     2 15.24194 0.5324943 Fold09.Rep1
> 39     2 12.90107 0.6318960 Fold04.Rep1
> 40     2 13.59574 0.6277869 Fold01.Rep4
> 41     2 19.73633 0.4154821 Fold07.Rep5
> 42     2 12.03759 0.6537381 Fold02.Rep5
> 43     2 15.47139 0.5597097 Fold02.Rep4
> 44     2 22.55060 0.3816672 Fold07.Rep3
> 45     2 14.57875 0.6269560 Fold07.Rep2
> 46     2 13.02385 0.6395148 Fold02.Rep2
> 47     2 13.81020 0.6116137 Fold02.Rep1
> 48     2 13.46100 0.6200828 Fold04.Rep4
> 49     2 13.95487 0.6709253 Fold10.Rep5
> 50     2 12.65981 0.6606435 Fold10.Rep4
>
> Best,
> Zhenjiang
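
As a rough check of how the summary table relates to plsTune$resample,
the 50 RMSE values quoted above can be averaged directly. The vector
below is copied by hand from that output; the name rmse is just an
illustrative choice. The mean should land close to the 14.3 reported
for ncomp = 2, and the standard deviation can be compared with the
RMSE SD column in the same row.

```r
# RMSE column of plsTune$resample, copied from the output quoted above
rmse <- c(13.61569, 16.02091, 12.59985, 13.20069, 12.43419,
          15.36510, 12.70028, 13.34882, 14.80217, 19.03705,
          14.26704, 13.79060, 14.83641, 12.48721, 14.98765,
          10.88100, 13.60705, 13.42618, 13.26066, 13.20623,
          18.54275, 11.80312, 18.56271, 13.54879, 14.10859,
          13.68329, 16.12123, 12.92250, 12.94366, 12.39889,
          13.48499, 12.52938, 16.43352, 12.53991, 12.78708,
          13.97559, 15.31642, 15.24194, 12.90107, 13.59574,
          19.73633, 12.03759, 15.47139, 22.55060, 14.57875,
          13.02385, 13.81020, 13.46100, 13.95487, 12.65981)

length(rmse)  # 10 folds x 5 repeats
mean(rmse)    # compare with the RMSE column for ncomp = 2
sd(rmse)      # compare with the RMSE SD column for ncomp = 2
```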



