[R] Splitting data into test and train (80:20) keeping attributes similar

Frank Harrell f.harrell at vanderbilt.edu
Thu Apr 26 13:35:23 CEST 2012


You can run simulations to find out how large N must be for split-sample
validation to be precise enough to trust; in other words, for different
random splits to give the same estimate of model accuracy to within some
small tolerance.  You will be surprised how large N must be for this to
happen.  Consider resampling (for example, the bootstrap or repeated
cross-validation) instead.
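
A minimal sketch of such a simulation, written in Python with only the
standard library (the same logic carries over directly to R).  Everything
in it is an assumption made for illustration and not part of the original
question: the data-generating process (y = 1 + 2x + noise), the model (a
simple least-squares line), the accuracy measure (test-set RMSE), and the
function names.  For each N it draws one dataset, repeats the random 80:20
split many times, and reports how much the accuracy estimate moves from
split to split.

```python
import random
import statistics


def simulate_data(n, rng):
    """Hypothetical data-generating process: y = 1 + 2x + N(0, 1) noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def split_sample_rmse(xs, ys, rng, train_frac=0.8):
    """Fit on a random 80% of the rows and return RMSE on the other 20%."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in test]
    return statistics.fmean(sq_err) ** 0.5


def spread_across_splits(n, n_splits=100, seed=1):
    """Return (min, max, sd) of the RMSE estimate over repeated random splits
    of a single simulated dataset of size n."""
    rng = random.Random(seed)
    xs, ys = simulate_data(n, rng)
    est = [split_sample_rmse(xs, ys, rng) for _ in range(n_splits)]
    return min(est), max(est), statistics.stdev(est)


if __name__ == "__main__":
    for n in (50, 500, 5000):
        lo, hi, sd = spread_across_splits(n)
        print(f"N={n:5d}  RMSE range over splits: {lo:.3f} to {hi:.3f}  (sd {sd:.3f})")
```

To use it for a real problem, replace the simulated data and model with
your own, pick the tolerance you are willing to accept, and increase N
until the range over splits falls inside that tolerance.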
Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University


