[R] comparing classification methods: 10-fold cv or leaving-one-out ?
Tony Plate
tplate at acm.org
Tue Jan 6 17:31:37 CET 2004
I would recommend reading the following: Dietterich, T. G. (1998).
Approximate Statistical Tests for Comparing Supervised Classification
Learning Algorithms. Neural Computation, 10(7), 1895-1924.
http://web.engr.oregonstate.edu/~tgd/publications/index.html
The issues in comparing methods are subtle and difficult. With such a
small data set I would be a little surprised if you could get any results
that are truly statistically significant, especially if your goal is to
compare among good non-linear methods (i.e., methods between which there are
unlikely to be huge differences due to model misspecification). However,
because the issues are subtle, it is easy to get results that merely appear
significant...
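To make the mechanics concrete, here is a minimal sketch (not taken from
Dietterich's paper; the synthetic ~200-row data set and the choice of lda()
from MASS and svm() from e1071 are just illustrative stand-ins for your own
data and methods). It runs a 10-fold CV comparison of two classifiers and
then a naive paired t-test on the per-fold error rates; Dietterich's point
is precisely that such naive tests can look significant too easily, so
treat the p-value with caution and see his paper for better alternatives
(McNemar's test, the 5x2cv paired t-test).

library(MASS)    # lda()
library(e1071)   # svm()

set.seed(1)
## made-up two-class problem with a non-linear boundary, roughly the
## size of your data set
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- factor(ifelse(x1^2 + x2^2 + rnorm(n, sd = 0.5) > 1.5, "out", "in"))
dat <- data.frame(y, x1, x2)

K <- 10
folds <- sample(rep(1:K, length.out = n))     # random fold assignment

err <- matrix(NA, K, 2, dimnames = list(NULL, c("lda", "svm")))
for (k in 1:K) {
  train <- dat[folds != k, ]
  test  <- dat[folds == k, ]
  err[k, "lda"] <- mean(predict(lda(y ~ ., data = train), test)$class != test$y)
  err[k, "svm"] <- mean(predict(svm(y ~ ., data = train), test) != test$y)
}

colMeans(err)                                      # 10-fold CV error estimates
t.test(err[, "lda"], err[, "svm"], paired = TRUE)  # naive paired comparison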
hope this helps,
Tony Plate
At Tuesday 04:31 PM 1/6/2004 +0100, Christoph Lehmann wrote:
>Hi
>what would you recommend to compare classification methods such as LDA,
>classification trees (rpart), bagging, SVM, etc:
>
>10-fold cv (as in Ripley p. 346f)
>
>or
>
>leaving-one-out (as e.g. implemented in LDA)?
>
>my data-set is not that huge (roughly 200 entries)
>
>many thanks for a hint
>
>Christoph
>--
>Christoph Lehmann <christoph.lehmann at gmx.ch>
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html