[R] Trying something for fun...
Charles C. Berry
cberry at tajo.ucsd.edu
Sun Aug 23 19:43:58 CEST 2009
On Sun, 23 Aug 2009, Charles C. Berry wrote:
> On Sat, 22 Aug 2009, Noah Silverman wrote:
>
>> And, of course that leads me to another question...
>>
>> With svm {e1071} I can ask the predict function to give me probabilities
>> with lrm {Design} I can ask the predict function to give me probabilities
>>
>> I can't see how to do this with clogit.
>>
>> Would someone be kind enough to explain the output options. (I can see
>> one that is a probability option.)
>
>
> Well, in the absence of bugs in predict.coxph you could do something like
>
> fit <- clogit( winner ~ strata( heat ) + x )
> new.preds <- predict( fit ,newdata=newdat, type = 'expected')
>
> but this fails for survival_2.35-4. (IIRC, the maintainer knows this and
> there was recent correspondence here or on R-devel about this bug)
>
> So you will have to work around this.
>
> Something like
>
> clogit.response <- function(x) Surv( I( rep(1, length(x)) ), x )
> fit <- coxph( clogit.response( winner ) ~ strata( heat ) + x )
> new.preds <- ave(
> predict( fit ,newdata=newdat, type = 'risk'),
> newdat$heat, FUN=prop.table )
>
> Ought to do it.
Oops.
Forgot to mention that you need a dummy placeholder for 'winner' in the
newdat data.frame. Something like
newdat$winner <- newdat$heat
should fix it.
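
For what it is worth, here is a minimal, self-contained sketch of the same idea that sidesteps predict() altogether. It is an illustration under stated assumptions, not a tested recipe for any particular survival version: the data are simulated, and the names dat, newdat, beta, n.heats and n.horses are made up for the example. It relies only on the fact that under the conditional logit model the within-heat probability is exp(lp) divided by the sum of exp(lp) over the heat.

```r
library(survival)

set.seed(1)

## Simulated training data (hypothetical): 200 heats of 8 horses each,
## with one covariate x per horse.
n.heats  <- 200
n.horses <- 8
dat <- data.frame(heat = rep(seq_len(n.heats), each = n.horses),
                  x    = rnorm(n.heats * n.horses))

## Draw exactly one winner per heat with probability proportional
## to exp(beta * x); beta = 0.8 is an arbitrary choice.
beta <- 0.8
dat$winner <- ave(exp(beta * dat$x), dat$heat,
                  FUN = function(w)
                      as.numeric(seq_along(w) ==
                                 sample(length(w), 1, prob = w)))

fit <- clogit(winner ~ x + strata(heat), data = dat)

## New heats to score (also made up). No 'winner' column is needed
## because the linear predictor is computed by hand.
newdat <- data.frame(heat = rep(c("A", "B"), c(8, 10)),
                     x    = rnorm(18))
lp <- newdat$x * coef(fit)[["x"]]

## Normalise exp(lp) within each heat so the probabilities sum to one.
newdat$prob <- ave(exp(lp), newdat$heat, FUN = prop.table)

tapply(newdat$prob, newdat$heat, sum)
```

With more than one covariate you would build lp from a model matrix rather than a single column, but the normalisation step with ave() and prop.table stays the same.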
>
> HTH,
>
> Chuck
>
>
>>
>> Thanks!!
>>
>> -N
>>
>>
>> On 8/22/09 10:57 AM, Charles C. Berry wrote:
>> > On Fri, 21 Aug 2009, Noah Silverman wrote:
>> >
>> > > Hi,
>> > >
>> > > For fun, I'm trying to throw some horse racing data into either an
>> > > svm or lrm model. Curious to see what comes out as there are so many
>> > > published papers on this.
>> > >
>> > > One thing I don't know how to do is to standardize the probabilities
>> > > by race.
>> >
>> >
>> > This sounds closer to the conditional logit model.
>> >
>> > However, if I recall correctly, the models-of-choice literature states
>> > an assumption along the lines of 'independence of irrelevant
>> > alternatives'. That assumption might not hold in a horse race, where
>> > the speed at which a horse runs may depend on which horses she is
>> > running against.
>> >
>> > See
>> >
>> > ?survival:::clogit
>> >
>> > and
>> >
>> > @article{mcfadden1974conditional,
>> > title={{Conditional logit analysis of qualitative choice behavior}},
>> > author={McFadden, D.},
>> > journal={Frontiers in econometrics},
>> > volume={8},
>> > pages={105--142},
>> > year={1974}
>> > }
>> >
>> >
>> > BTW, Professor McFadden has a quintessentially American biography:
>> >
>> > http://nobelprize.org/nobel_prizes/economics/laureates/2000/mcfadden-autobio.html
>> >
>> >
>> > He mentions his personal background in farming and awards won for his
>> > 'sheep and geese', but alas does not mention horses or racing.
>> >
>> > HTH,
>> >
>> > Chuck
>> >
>> > >
>> > > For example, if I train an LRM on a bunch of variables I get a model.
>> > > I can then get probability predictions from the model. That works.
>> > >
>> > > It seems to me that for a given race (8-12 horses) the probabilities
>> > > of my predictions should sum to one.
>> > >
>> > > 1) Is there some way to train the LRM to evaluate and then model the
>> > > subsequent data "per race"? (Perhaps indicate some kind of grouping
>> > > variable?)
>> > >
>> > > 2) Alternately, if I just run my data through a "standard" LRM, is
>> > > there some way to then "normalize" the probabilities in a correct way
>> > > for each upcoming race?
>> > >
>> > > I've done some extensive research in this area and would be willing
>> > > to discuss more details offline with someone if they could contribute
>> > > to the process.
>> > >
>> > > Thanks!!
>> > >
>> > > -N
>> > >
>> > > ______________________________________________
>> > > R-help at r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide
>> > > http://www.R-project.org/posting-guide.html
>> > > and provide commented, minimal, self-contained, reproducible code.
>> > >
>> > >
>> >
>> > Charles C. Berry (858) 534-2098
>> > Dept of Family/Preventive Medicine
>> > E mailto:cberry at tajo.ucsd.edu UC San Diego
>> > http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
>> >
>> >
>>
>>
>
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901