[R] test cases (data) for data based modeling

Immanuel B mane.desk at googlemail.com
Mon Jun 6 17:12:50 CEST 2011


Hello all,

I'm working mostly with machine learning code in R and looking for a structured
way to check if my code is working properly.

For example, if I train a classifier on some data, how do I know whether
good or bad results are due to the data and not just a programming
error that I introduced somewhere?

results are too good: I might have used some part of the test data for training
results are too bad: could be for any reason
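
For the first case, the kind of check I have in mind would look roughly
like this. It is only a minimal sketch: the index names and the 70/30
split are made up for illustration.

```r
# Hypothetical split of 100 rows into training and test indices.
set.seed(1)
n <- 100
train_idx <- sample(n, size = 70)
test_idx  <- setdiff(seq_len(n), train_idx)

# A leak would show up as row indices that appear in both sets.
overlap <- intersect(train_idx, test_idx)
stopifnot(length(overlap) == 0)
```

This only catches rows shared between the two sets, not subtler leaks
such as scaling the features with statistics computed on all the data.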

I know that I can in principle generate data containing no information
at all, or pure information, to benchmark my code, but is there a more
elaborate or easier way to do that?
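
To make concrete what I mean by "no information" and "pure information",
here is a minimal sketch in base R. The nearest-centroid classifier is
only a stand-in for the code under test, and all function names, sample
sizes and the shift of 10 are assumptions chosen for illustration.

```r
set.seed(42)

# Stand-in for the code under test: a two-class nearest-centroid classifier.
fit_centroids <- function(x, y) {
  rbind(colMeans(x[y == 0, , drop = FALSE]),
        colMeans(x[y == 1, , drop = FALSE]))
}

predict_centroids <- function(centroids, x) {
  # Squared Euclidean distance of every row to each class centroid.
  d0 <- rowSums(sweep(x, 2, centroids[1, ])^2)
  d1 <- rowSums(sweep(x, 2, centroids[2, ])^2)
  as.integer(d1 < d0)
}

# Train on a random subset of the rows, report accuracy on the rest.
holdout_accuracy <- function(x, y, train_frac = 0.5) {
  n <- nrow(x)
  train <- sample(n, size = floor(train_frac * n))
  cen <- fit_centroids(x[train, , drop = FALSE], y[train])
  pred <- predict_centroids(cen, x[-train, , drop = FALSE])
  mean(pred == y[-train])
}

n <- 2000
x <- matrix(rnorm(n * 5), ncol = 5)

# No information: labels are drawn independently of the features,
# so held-out accuracy should be close to 0.5.
y_noise <- rbinom(n, 1, 0.5)
acc_noise <- holdout_accuracy(x, y_noise)

# Pure information: the label shifts the first feature by 10 standard
# deviations, so held-out accuracy should be close to 1.
y_signal <- rep(0:1, length.out = n)
x_signal <- x
x_signal[, 1] <- x_signal[, 1] + 10 * y_signal
acc_signal <- holdout_accuracy(x_signal, y_signal)

cat("accuracy on noise: ", acc_noise, "\n")
cat("accuracy on signal:", acc_signal, "\n")
```

If the noise case scored well above chance I would suspect a leak, and
if the signal case scored clearly below 1 I would suspect a bug in the
training or prediction code.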

I guess what I'm basically looking for is some kind of unit testing
framework to generate test data for machine learning tasks. I read
about the package RUnit but don't really know how to proceed from
there.

Any ideas?
How do you test your data analysis code?

best regards,
Immanuel


