[R] Grouped Logistic (Or conditional Logistic.)
Noah Silverman
noah at smartmediacorp.com
Thu Sep 17 20:36:23 CEST 2009
Ted,
Thanks for the reply.
For the example, I'm not looking to predict "THE winner", but to find
the best probabilities of winning.
It would seem that the process of iterating through possible
coefficients would be the same as in a standard GLM, but the "evaluation"
part as you work through them would have to be adjusted to look "per group".
I would call this something like "grouped maximum likelihood" if I got
to make up the name.
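
To make the "per group" evaluation concrete, here is a minimal sketch of
what I have in mind, written in plain Python only so that it is
self-contained. It is an illustration under assumptions, not a tested
method: within each race the linear scores are turned into probabilities
that sum to 1, and the quantity being maximized is the summed log
probability of the actual winners. The function names, the two features
(speed, past wins), the step size, and the toy data are all made up. This
looks like what is usually called a conditional logistic model; in R, I
believe clogit() in the survival package with a strata() term fits a model
of this form, but I have not checked it against this example.

```python
import math

def race_probabilities(features, beta):
    """Win probabilities for one race: softmax of the linear scores."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in features]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def grouped_log_likelihood(races, beta):
    """Sum over races of log P(the actual winner wins that race)."""
    return sum(math.log(race_probabilities(feats, beta)[winner])
               for feats, winner in races)

def fit(races, n_features, step=0.1, iterations=500):
    """Plain gradient ascent on the grouped log-likelihood (illustrative)."""
    beta = [0.0] * n_features
    for _ in range(iterations):
        grad = [0.0] * n_features
        for feats, winner in races:
            probs = race_probabilities(feats, beta)
            for k in range(n_features):
                # Gradient term: winner's feature minus its expected value
                # under the current within-race probabilities.
                expected = sum(p * row[k] for p, row in zip(probs, feats))
                grad[k] += feats[winner][k] - expected
        beta = [b + step * g for b, g in zip(beta, grad)]
    return beta

# Made-up toy data: each race is (rows of [speed, past_wins], winner index).
races = [
    ([[1.0, 0.0], [0.5, 1.0], [0.2, 0.0]], 0),
    ([[0.9, 1.0], [0.3, 0.0], [0.6, 0.0]], 0),
    ([[0.4, 0.0], [0.8, 1.0], [0.1, 1.0]], 1),
    ([[0.7, 0.0], [0.6, 1.0], [0.9, 0.0]], 1),
    ([[0.9, 1.0], [0.2, 0.0]], 1),
]
beta = fit(races, n_features=2)
print(race_probabilities(races[0][0], beta))
```

There is one set of coefficients for all races, and the race label is
used only to decide which horses are normalized together.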
-N
On 9/17/09 11:06 AM, (Ted Harding) wrote:
> On 17-Sep-09 17:28:16, Noah Silverman wrote:
>
>> Hi,
>> I'm not sure of the correct nomenclature or function for what
>> I'm trying to do.
>>
>> I'm interested in calculating a logistic regression on a binary
>> dependent variable (True,False).
>>
>> There are a few ways to easily do this in R. Both SVM and GLM
>> work easily.
>>
>> The part that I want to add is "group wise" awareness, so that
>> the algorithm computes the coefficients to maximize the likelihood
>> of a "True" label per group.
>>
>> A toy explanation is probably best. I've been looking at horse
>> racing models as a fun field to learn about statistics and R.
>>
>> So, for this example, let's assume the following:
>> 100 horses in our stable
>> 10 horses per race
>> 75 races this season (some horses race more than once.)
>>
>> The independent variables are things about a horse (average speed,
>> number of past wins, etc.)
>> The dependent variable is (Win, Lose) represented by (1,0)
>>
>> As mentioned above, an SVM or GLM will quickly work to estimate
>> coefficients and probability of a Win. I'd like to take it further
>> and estimate the probability of a win, but look at it per race.
>>
>> I'm NOT interested in the group label as a final part of the model.
>> I don't want a separate set of coefficients for each group. I just
>> want the iterative algorithm to work toward maximizing the likelihood
>> PER GROUP as an average.
>>
>> I looked extensively through rseek.org for things like "grouped
>> logistic" and "nested logistic". I couldn't seem to find anything
>> that does this. I'm probably naming it wrong.
>>
>> I assume that a MANUAL iteration concept would be to:
>> 1) Pick a coefficient
>> 2) Calculate the resulting probability for each horse.
>> 3) Measure the strength of the result for each race (sum them
>> together or average them?)
>> 4) Adjust coefficient and repeat
>>
>> Surely there must be some standard function in a library that will
>> do this.
>>
>> Can any of the stat gurus here offer some suggestions?
>>
>> Thanks!
>> --
>> Noah
>>
> In the context of your "fun example", you have a fundamental problem
> in that (if I've understood your statement of it correctly) you will
> have more than one of your horses in the same race (apparently 10).
>
> Therefore, one of them winning excludes any of the others winning in
> that same race, so their results are not independent of each
> other.
>
> Also, at least in real life, the probability that a given horse will
> win in a particular race depends not only on the covariates "per horse"
> (such as your average speed, number of past wins, etc.), and indeed
> on the condition of the race-course at the time, but also (and usually
> strongly) on the characteristics of the other horses in the same race.
>
> So a simple logistic model of the kind you seem to be proposing would
> certainly not be realistic!
>
> I would be happier thinking about your problem in the context of a
> different kind of example ...
>
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding)<Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 17-Sep-09 Time: 19:06:27
> ------------------------------ XFMail ------------------------------
>