[R] Inconsistent results between caret+kernlab versions

Fri Nov 15 01:31:40 CET 2013

I'm using caret to assess classifier performance (and it's great!). However, I've found that my results differ between R2.* and R3.* - reported accuracies are reduced dramatically. I suspect that a code change to kernlab ksvm may be responsible (see version 5.16-24 here: http://cran.r-project.org/web/packages/caret/news.html). I get very different results between caret_5.15-61 + kernlab_0.9-17 and caret_5.17-7 + kernlab_0.9-19 (see below).

Can anyone please shed any light on this?

Thanks very much!

### To replicate:

require(repmis)  # For downloading from https
df <- source_data('https://dl.dropboxusercontent.com/u/47973221/data.csv', sep=',')
require(caret)
svm.m1 <- train(df[,-1],df[,1],method='svmRadial',metric='Kappa',tunelength=5,trControl=trainControl(method='repeatedcv', number=10, repeats=10, classProbs=TRUE))
svm.m1
sessionInfo()

### Results - R2.15.2

> svm.m1
1241 samples
   7 predictors
  10 classes: ‘O27479’, ‘O31403’, ‘O32057’, ‘O32059’, ‘O32060’, ‘O32078’, ‘O32089’, ‘O32663’, ‘O32668’, ‘O32676’ 

No pre-processing
Resampling: Cross-Validation (10 fold, repeated 10 times) 

Summary of sample sizes: 1116, 1116, 1114, 1118, 1118, 1119, ... 

Resampling results across tuning parameters:

  C     Accuracy  Kappa  Accuracy SD  Kappa SD
  0.25  0.684     0.63   0.0353       0.0416  
  0.5   0.729     0.685  0.0379       0.0445  
  1     0.756     0.716  0.0357       0.0418  

Tuning parameter ‘sigma’ was held constant at a value of 0.247
Kappa was used to select the optimal model using  the largest value.
The final values used for the model were C = 1 and sigma = 0.247. 
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] e1071_1.6-1     class_7.3-5     kernlab_0.9-17  repmis_0.2.4    caret_5.15-61   reshape2_1.2.2  plyr_1.8        lattice_0.20-10 foreach_1.4.0   cluster_1.14.3 

loaded via a namespace (and not attached):
 [1] codetools_0.2-8 compiler_2.15.2 digest_0.6.0    evaluate_0.4.3  formatR_0.7     grid_2.15.2     httr_0.2        iterators_1.0.6 knitr_1.1       RCurl_1.95-4.1  stringr_0.6.2   tools_2.15.2  

### Results - R3.0.2

> require(caret)
> svm.m1 <- train(df[,-1],df[,1],method=’svmRadial’,metric=’Kappa’,tunelength=5,trControl=trainControl(method=’repeatedcv’, number=10, repeats=10, classProbs=TRUE))
Loading required package: class
Warning messages:
1: closing unused connection 4 (https://dl.dropboxusercontent.com/u/47973221/df.Rdata) 
2: executing %dopar% sequentially: no parallel backend registered 
> svm.m1
1241 samples
   7 predictors
  10 classes: ‘O27479’, ‘O31403’, ‘O32057’, ‘O32059’, ‘O32060’, ‘O32078’, ‘O32089’, ‘O32663’, ‘O32668’, ‘O32676’ 

No pre-processing
Resampling: Cross-Validation (10 fold, repeated 10 times) 

Summary of sample sizes: 1118, 1117, 1115, 1117, 1116, 1118, ... 

Resampling results across tuning parameters:

  C     Accuracy  Kappa  Accuracy SD  Kappa SD
  0.25  0.372     0.278  0.033        0.0371  
  0.5   0.39      0.297  0.0317       0.0358  
  1     0.399     0.307  0.0289       0.0323  

Tuning parameter ‘sigma’ was held constant at a value of 0.2148907
Kappa was used to select the optimal model using  the largest value.
The final values used for the model were C = 1 and sigma = 0.215. 
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] e1071_1.6-1     class_7.3-9     kernlab_0.9-19  repmis_0.2.6.2  caret_5.17-7    reshape2_1.2.2  plyr_1.8        lattice_0.20-24 foreach_1.4.1   cluster_1.14.4 

loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.0.2  digest_0.6.3    grid_3.0.2      httr_0.2        iterators_1.0.6 RCurl_1.95-4.1  stringr_0.6.2   tools_3.0.2