[R] machine learning and horse racing

Gerard Smits g_smits at verizon.net
Tue Sep 18 23:03:43 CEST 2007


Hi Stephen,

Not responding to the R memory question, but to the racing.

I worked on this many years ago and found no way of overcoming the
19% or so pari-mutuel take.  That being said, I suggest you take class
into account, based on purse and type of race (maiden claiming,
claiming price, NWxx allowance, etc.).  Also make sure that you are
accounting for the size of the field: it is much easier to win a race
of 6 horses than one of 12.  The same applies to the apparent
advantage of inner post positions, which will be biased if you do not
account for the number of entries.
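
As a rough sketch of the field-size point (in Python rather than R,
with made-up data and hypothetical names): one way to account for the
number of entries is to compare each post position's actual wins with
the wins it would get if every starter had an equal 1/field_size
chance.  A ratio above 1 suggests an advantage beyond what small
fields alone would explain.  This is only one possible adjustment, not
a tested method.

```python
from collections import defaultdict


def post_position_impact(races):
    """Return {post: actual wins / expected wins} for each post position.

    `races` is an iterable of (field_size, winning_post) tuples; this
    input layout is an assumption made for the example. Expected wins
    add 1/field_size for every race in which the post was occupied.
    """
    wins = defaultdict(int)
    expected = defaultdict(float)
    for field_size, winning_post in races:
        for post in range(1, field_size + 1):
            expected[post] += 1.0 / field_size
        wins[winning_post] += 1
    return {post: wins[post] / expected[post] for post in sorted(expected)}


# Toy, invented results: two 6-horse races and two 12-horse races.
races = [(6, 1), (6, 2), (12, 1), (12, 7)]
impact = post_position_impact(races)
for post, ratio in impact.items():
    print(post, round(ratio, 2))
```

With real data the same ratio could be computed separately by track,
distance, and surface, since post position effects are unlikely to be
the same everywhere.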

Re validation, I would not build a model on X years of data and then
validate.  Patterns change and a model needs to be adaptive.  I would
hold out one randomly chosen day per week and validate on those days.
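
A minimal sketch of that hold-out scheme, again in Python and with
hypothetical names: group the race days by ISO week, pick one day at
random from each week for validation, and train on the rest.  The
function name, the fixed seed, and the use of ISO weeks are all
assumptions made for the example.

```python
import random
from collections import defaultdict
from datetime import date, timedelta


def weekly_holdout_days(race_days, seed=0):
    """Pick one random hold-out day from each ISO week in `race_days`."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_week = defaultdict(list)
    for day in sorted(set(race_days)):
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])].append(day)  # key: (ISO year, ISO week)
    return {rng.choice(days) for days in by_week.values()}


# Two full weeks of invented race days, starting Monday 3 September 2007.
days = [date(2007, 9, 3) + timedelta(days=i) for i in range(14)]
holdout = weekly_holdout_days(days)
train = [d for d in days if d not in holdout]
print(sorted(holdout))
```

Because the held-out days are spread across the whole period, the
model can be refit as new weeks arrive while validation still covers
recent racing.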

Good luck with a difficult task.

Gerard


