[R] Trying something for fun...

Sat Aug 22 20:06:23 CEST 2009

Thanks Charles,

I'll have a look at the conditional logit function.

One question: my stratum is the "race" (actually a concatenation of date 
and number), so the actual values used in the training set are 
different from those in the test set.  Will that matter?  (In other words, 
when training a clogit, is the exact value of the strata saved as part of 
the model, or is it just used for grouping?)
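
For what it's worth, here is a minimal sketch of how this could be checked on simulated data. It assumes the survival package is installed, exactly one winner per race, and a made-up covariate called speed; softmax() is a helper defined here, not part of any package. As far as I understand clogit, the stratum labels only define the groups in the conditional likelihood and no per-stratum term is estimated, so the sketch scores new races from the coefficients alone and rescales within each race.

```r
library(survival)

set.seed(1)

# Hypothetical helper: turn linear predictors into within-race probabilities.
softmax <- function(eta) {
  e <- exp(eta - max(eta))
  e / sum(e)
}

# Simulated training data: 200 races of 8 horses, one made-up covariate.
n_races  <- 200
n_horses <- 8
train <- data.frame(
  race  = rep(paste0("train_", seq_len(n_races)), each = n_horses),
  speed = rnorm(n_races * n_horses)
)

# Draw exactly one winner per race, with faster horses more likely to win.
train$win <- ave(train$speed, train$race, FUN = function(s) {
  as.numeric(seq_along(s) == sample(length(s), 1, prob = softmax(s)))
})

# Conditional logit: strata(race) groups the rows of each race together.
fit <- clogit(win ~ speed + strata(race), data = train)

# Test races carry labels that never appeared in training.
test <- data.frame(
  race  = rep(c("new_A", "new_B"), each = 4),
  speed = rnorm(8)
)

# Score with the coefficients only, then normalize within each race.
eta <- as.vector(as.matrix(test[names(coef(fit))]) %*% coef(fit))
test$p_win <- ave(eta, test$race, FUN = softmax)

# Each race's probabilities sum to one.
print(tapply(test$p_win, test$race, sum))
```

The linear predictor is computed by hand rather than through predict(), so that nothing in the scoring step depends on the race labels seen during training.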

On 8/22/09 10:57 AM, Charles C. Berry wrote:
> On Fri, 21 Aug 2009, Noah Silverman wrote:
>
>> Hi,
>>
>> For fun, I'm trying to throw some horse racing data into either an 
>> svm or lrm model.  Curious to see what comes out as there are so many 
>> published papers on this.
>>
>> One thing I don't know how to do is to standardize the probabilities 
>> by race.
>
>
> This sounds closer to the conditional logit model.
>
> However, if I recall correctly, there is an assumption in the 
> models-of-choice literature that is stated something like 'independence 
> of alternatives that are unavailable'.  That assumption might not hold in 
> a horse race, where the speed at which a horse runs may depend on which 
> horses she is running against.
>
> See
>
>     ?survival:::clogit
>
> and
>
> @article{mcfadden1974conditional,
>   title={{Conditional logit analysis of qualitative choice behavior}},
>   author={McFadden, D.},
>   journal={Frontiers in econometrics},
>   volume={8},
>   pages={105--142},
>   year={1974}
> }
>
>
> BTW, Professor McFadden has a quintessentially American biography:
>
>  http://nobelprize.org/nobel_prizes/economics/laureates/2000/mcfadden-autobio.html 
>
>
> He mentions his personal background in farming and awards won for his 
> 'sheep and geese', but alas does not mention horses or racing.
>
> HTH,
>
> Chuck
>
>>
>> For example, if I train an LRM on a bunch of variables, I get a model.  
>> I can then get probability predictions from the model.  That works.
>>
>> It seems to me that for a given race (8-12 horses) the probabilities 
>> of my predictions should sum to one.
>>
>> 1) Is there some way to train the LRM to evaluate and then model the 
>> subsequent data "per race"?  (Perhaps by indicating some kind of 
>> grouping variable?)
>>
>> 2) Alternately, if I just run my data through a "standard" LRM, is 
>> there some way to then "normalize" the probabilities in a correct way 
>> for each upcoming race?
>>
>> I've done some extensive research in this area and would be willing 
>> to discuss more details offline with someone if they could contribute 
>> to the process.
>>
>> Thanks!!
>>
>> -N
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu                UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>
>
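
On question 2 in the original post (normalizing the output of a "standard" model), here is a minimal sketch of the simplest rescaling: divide each predicted probability by the sum of the predictions in its race. It uses glm() from base R as a stand-in for lrm, and the data, the speed covariate, and the column names are all made up for illustration. This is an ad hoc adjustment that forces the per-race sums to one; it is not a per-race model, and whether it is the "correct" normalization is a separate question.

```r
set.seed(2)

# Simulated data: 50 races of 10 horses with one made-up covariate.
d <- data.frame(
  race  = rep(paste0("r", 1:50), each = 10),
  speed = rnorm(500)
)

# Exactly one winner per race; faster horses are more likely to win.
d$win <- ave(d$speed, d$race, FUN = function(s) {
  as.numeric(seq_along(s) == sample(length(s), 1, prob = exp(s)))
})

# Ordinary logistic regression that ignores the race grouping.
fit <- glm(win ~ speed, family = binomial, data = d)

# Raw predictions generally do not sum to one within a race.
d$p_raw <- predict(fit, type = "response")

# Rescale so that each race's probabilities sum to one.
d$p_norm <- d$p_raw / ave(d$p_raw, d$race, FUN = sum)

print(head(tapply(d$p_raw,  d$race, sum)))
print(head(tapply(d$p_norm, d$race, sum)))
```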