[R] Trying something for fun...

Charles C. Berry cberry at tajo.ucsd.edu
Sun Aug 23 19:29:59 CEST 2009


On Sat, 22 Aug 2009, Noah Silverman wrote:

> And, of course that leads me to another question...
>
> With svm {e1071} I can ask the predict function to give me probabilities
> with lrm {Design} I can ask the predict function to give me probabilities
>
> I can't see how to do this with clogit.
>
> Would someone be kind enough to explain the output options.  (I can see one 
> that is a probability option.)


Well, in the absence of bugs in predict.coxph you could do something like

 	fit <- clogit( winner ~ strata( heat ) + x )
 	new.preds <- predict( fit ,newdata=newdat, type = 'expected')

but this fails for survival_2.35-4. (IIRC, the maintainer knows this and 
there was recent correspondence here or on R-devel about this bug)

So you will have to work around this.

Something like

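 	# every row gets time 1; x (0/1) is the event indicator
 	clogit.response <- function(x) Surv( I( rep(1, length(x)) ), x )
 	fit <- coxph( clogit.response( winner ) ~ strata( heat ) + x )
 	# rescale the risk scores so they sum to 1 within each heat
 	new.preds <- ave(
 			predict( fit ,newdata=newdat, type = 'risk'),
 			newdat$heat, FUN=prop.table )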

Ought to do it.
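
To spell out why the within-heat rescaling gives probabilities: under the conditional logit model the chance that horse i wins its heat is exp(x_i * beta) / sum_j exp(x_j * beta), with the sum taken over the horses in that heat. Here is a minimal sketch on simulated data. The data frames, the true coefficient of 0.8, and the heat labels are made up for illustration. It also builds the linear predictor by hand from coef(fit) instead of calling predict(), which is a substitution on my part so that the example does not depend on how predict.coxph treats heats that were not in the fit.

```r
library(survival)  # provides clogit() and strata()

set.seed(1)

# Simulated training data (hypothetical): 50 heats of 8 horses, one covariate x.
n.heats  <- 50
n.horses <- 8
dat <- data.frame(heat = rep(seq_len(n.heats), each = n.horses),
                  x    = rnorm(n.heats * n.horses))

# Draw exactly one winner per heat, with probability proportional to exp(0.8 * x).
dat$winner <- ave(dat$x, dat$heat, FUN = function(v) {
  w <- sample(length(v), 1, prob = exp(0.8 * v))
  as.numeric(seq_along(v) == w)
})

fit <- clogit(winner ~ x + strata(heat), data = dat)

# Two new heats that were not part of the fit.
newdat <- data.frame(heat = rep(c("A", "B"), each = 4), x = rnorm(8))

# Linear predictor from the fitted coefficient, then normalise within heat.
eta <- drop(as.matrix(newdat["x"]) %*% coef(fit))
newdat$prob <- ave(exp(eta), newdat$heat, FUN = prop.table)

# Check: the predicted probabilities sum to one in each heat.
heat.sums <- tapply(newdat$prob, newdat$heat, sum)
print(heat.sums)
```

Any constant added to the linear predictor within a heat cancels in the ratio, which is why the centering that predict() applies to risk scores should not matter once you rescale by heat.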

HTH,

Chuck


>
> Thanks!!
>
> -N
>
>
> On 8/22/09 10:57 AM, Charles C. Berry wrote:
>>  On Fri, 21 Aug 2009, Noah Silverman wrote:
>> 
>> >  Hi,
>> > 
>> >  For fun, I'm trying to throw some horse racing data into either an svm 
>> >  or lrm model.  Curious to see what comes out as there are so many 
>> >  published papers on this.
>> > 
>> >  One thing I don't know how to do is to standardize the probabilities by 
>> >  race.
>> 
>>
>>  This sounds closer to the conditional logit model.
>>
>>  However, if I recall correctly, the choice-modeling literature relies on
>>  an assumption usually called 'independence of irrelevant alternatives'.
>>  That assumption might not hold in a horse race, where the speed at which
>>  a horse runs may depend on which horses she is running against.
>>
>>  See
>>
>>      ?survival:::clogit
>>
>>  and
>> 
>> @article{mcfadden1974conditional,
>>    title={{Conditional logit analysis of qualitative choice behavior}},
>>    author={McFadden, D.},
>>    journal={Frontiers in econometrics},
>>    volume={8},
>>    pages={105--142},
>>    year={1974}
>> }
>> 
>>
>>  BTW, Professor McFadden has a quintessentially American biography:
>>
>>   http://nobelprize.org/nobel_prizes/economics/laureates/2000/mcfadden-autobio.html 
>> 
>>
>>  He mentions his personal background in farming and awards won for his
>>  'sheep and geese', but alas does not mention horses or racing.
>>
>>  HTH,
>>
>>  Chuck
>> 
>> > 
>> >  For example, if I train an LRM on a bunch of variables, I get a model.  I 
>> >  can then get probability predictions from the model.  That works.
>> > 
>> >  It seems to me that for a given race (8-12 horses) the probabilities of 
>> >  my predictions should sum to one.
>> > 
>> >  1) Is there some way to train the LRM to evaluate and then model the 
>> >  subsequent data "per race"?  (Perhaps by indicating some kind of grouping 
>> >  variable?)
>> > 
>> >  2) Alternately, if I just run my data through a "standard" LRM, is there 
>> >  some way to then "normalize" the probabilities in a correct way for each 
>> >  upcoming race?
>> > 
>> >  I've done some extensive research in this area and would be willing to 
>> >  discuss more details offline with someone if they could contribute to 
>> >  the process.
>> > 
>> >  Thanks!!
>> > 
>> >  -N
>
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



