[R] Grouped Logistic (Or conditional Logistic.)
Noah Silverman
noah at smartmediacorp.com
Thu Sep 17 19:28:16 CEST 2009
Hi,
I'm not sure of the correct nomenclature or function for what I'm trying
to do.
I'm interested in fitting a logistic regression on a binary dependent
variable (True, False).
There are a few ways to easily do this in R. Both SVM and GLM work easily.
The part that I want to add is "group-wise" awareness, so that the
algorithm computes the coefficients to maximize the likelihood of a
"True" label per group.
A toy example is probably best. I've been looking at horse racing
models as a fun field to learn about statistics and R.
So, for this example, let's assume the following:
100 horses in our stable
10 horses per race
75 races this season (some horses race more than once.)
The independent variables are things about a horse (average speed,
number of past wins, etc.)
The dependent variable is (Win, Lose) represented by (1,0)
As mentioned above, an SVM or GLM will quickly estimate coefficients
and the probability of a Win. I'd like to take it further and estimate
the probability of a win within each race, relative to the other horses
in that race.
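To make the gap concrete, here is a minimal sketch on simulated data. The column names (race, speed, past_wins, win), the coefficients 0.8 and 0.3, and the simulation itself are assumptions for illustration, not real racing data, and horse identity across races is ignored. It fits an ordinary glm and shows that the fitted win probabilities within a race do not sum to 1, which is the per-race awareness that is missing.

```r
set.seed(1)
n_races <- 75
n_per   <- 10
n       <- n_races * n_per

# Simulated stand-in data: one row per horse per race (assumed layout)
d <- data.frame(race      = rep(seq_len(n_races), each = n_per),
                speed     = rnorm(n),
                past_wins = rpois(n, 2))

# Exactly one winner per race: the horse with the largest noisy score.
# Gumbel noise makes the win probability proportional to exp(score).
score  <- 0.8 * d$speed + 0.3 * d$past_wins
gumbel <- -log(-log(runif(n)))
d$win  <- as.integer(ave(score + gumbel, d$race,
                         FUN = function(x) x == max(x)))

# Ordinary logistic regression: every row is treated as independent
fit     <- glm(win ~ speed + past_wins, data = d, family = binomial)
d$p_glm <- predict(fit, type = "response")

# Sum of fitted win probabilities within each race; nothing forces 1
race_totals <- tapply(d$p_glm, d$race, sum)
range(race_totals)
```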
I'm NOT interested in the group label as a final part of the model. I
don't want a separate set of coefficients for each group. I just want
the iterative algorithm to work toward maximizing the likelihood PER
GROUP as an average.
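One standard model that matches this description, and the one the subject line hints at, is conditional logistic regression with each race as a stratum: a single set of coefficients, with the likelihood built race by race. Below is a sketch using clogit() from the survival package, on simulated data with assumed column names (race, speed, past_wins, win). Whether it fits real racing data well is a separate question.

```r
library(survival)

set.seed(1)
n_races <- 75
n_per   <- 10
n       <- n_races * n_per

# Simulated stand-in data (assumed layout): one row per horse per race
d <- data.frame(race      = rep(seq_len(n_races), each = n_per),
                speed     = rnorm(n),
                past_wins = rpois(n, 2))
score  <- 0.8 * d$speed + 0.3 * d$past_wins
gumbel <- -log(-log(runif(n)))
d$win  <- as.integer(ave(score + gumbel, d$race,
                         FUN = function(x) x == max(x)))

# Conditional logit: one coefficient vector, races enter only as strata
fit <- clogit(win ~ speed + past_wins + strata(race), data = d)
coef(fit)

# Per-race win probabilities: exp(score) normalized within each race
eta     <- as.vector(as.matrix(d[, c("speed", "past_wins")]) %*% coef(fit))
d$p_win <- ave(exp(eta), d$race, FUN = function(x) x / sum(x))
```

There is no intercept and no per-race coefficient in this model; the strata only define which horses are compared with each other, and the probabilities in each race sum to 1 by construction.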
I looked extensively through rseek.org for things like "grouped
logistic" and "nested logistic". I couldn't seem to find anything that
does this. I'm probably naming it wrong.
I assume that a MANUAL iteration concept would be to:
1) Pick a coefficient
2) Calculate the resulting probability for each horse.
3) Measure the strength of the result for each race (sum them
together or average them?)
4) Adjust coefficient and repeat
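The manual steps above can be sketched with optim(), assuming the "strength" of a race is the log of the winner's probability after normalizing exp(score) over the horses in that race, and that races are combined by summing those logs. That choice is an assumption (it is the conditional logit likelihood), not the only possible objective. The data and column names are simulated stand-ins.

```r
set.seed(1)
n_races <- 75
n_per   <- 10
n       <- n_races * n_per

# Simulated stand-in data (assumed layout): one row per horse per race
d <- data.frame(race      = rep(seq_len(n_races), each = n_per),
                speed     = rnorm(n),
                past_wins = rpois(n, 2))
score  <- 0.8 * d$speed + 0.3 * d$past_wins
gumbel <- -log(-log(runif(n)))
d$win  <- as.integer(ave(score + gumbel, d$race,
                         FUN = function(x) x == max(x)))

X <- as.matrix(d[, c("speed", "past_wins")])

# Steps 2 and 3: score every horse, then measure each race by the
# log-probability of its winner under a within-race softmax
neg_loglik <- function(beta) {
  eta       <- as.vector(X %*% beta)
  log_denom <- tapply(eta, d$race, function(e) log(sum(exp(e))))
  -(sum(eta[d$win == 1]) - sum(log_denom))
}

# Steps 1 and 4: start at zero and let optim() adjust the coefficients
opt <- optim(c(0, 0), neg_loglik, method = "BFGS")
opt$par
```

Under this objective the per-race terms are summed rather than averaged; averaging would only rescale the objective by the number of races and would not change the fitted coefficients.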
Surely there must be some standard function in a library that will do this.
Can any of the stat gurus here offer some suggestions?
Thanks!
--
Noah