# [R] Logistic Regression

David Winsemius dwinsemius at comcast.net
Wed Aug 5 01:03:52 CEST 2009

```
On Aug 4, 2009, at 6:52 PM, Noah Silverman wrote:

> Hmmm..  I'll try that.
>
> I recall reading somewhere that the "group" variable had to be
> indicated in a special way.
>

I use lrm and cph all the time (thank you, Frank Harrell), and I
can assure you that a factor variable, or one that takes only a small
number of values, causes no problems in estimation. Of course, it helps
to have a statistical background so that you know what the output means
and can check the predictions against data. In the formula below
(assuming v1 and v2 are continuous), each group coefficient would be
the incremental log odds of an event for a group value of <whatever>
versus the baseline level of group, while controlling for v1 and v2. In
"statistical hyperspace" the estimate (on the log odds scale) for a
multi-level categorical variable is the distance between the parallel
hyper-planes of the estimates for the continuous variables.

(And can you stop sending HTML mail to the list, please?)

> -N
>
> On 8/4/09 3:49 PM, David Winsemius wrote:
>>
>>
>> On Aug 4, 2009, at 6:45 PM, Noah Silverman wrote:
>>
>>> I guess I didn't explain it well enough.
>>>
>>> I have a number of training examples.  They have 4 fields.
>>> label, v1, v2, group
>>>
>>> The label is binary ("yes", "no")
>>>
>>> My understanding (quite possibly wrong) was that there was a way
>>> to train the LR to estimate probabilities "per group".
>>>
>>> In pseudo-code it would be:
>>> lrm( label ~ v1 + v2, group_by(group) )
>>>
>>
>> Why not :
>>
>> lrm( label ~ v1 + v2 + group)
>>
>> ?
>>
>>>
>>> On 8/4/09 3:41 PM, David Winsemius wrote:
>>>>
>>>>
>>>> On Aug 4, 2009, at 6:38 PM, Noah Silverman wrote:
>>>>
>>>>> Thanks David,
>>>>>
>>>>> But HOW do I indicate the "grouping" variable in the formula?
>>>>
>>>> Hard to tell. You have told us absolutely nothing about the
>>>> problem. Discrete variables cause no problems in formulas.
>>>> Perhaps one of :
>>>>
>>>> ?factor
>>>> ?cut
>>>> ?quantile
>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -N
>>>>>
>>>>> On 8/4/09 3:37 PM, David Winsemius wrote:
>>>>>>
>>>>>>
>>>>>> On Aug 4, 2009, at 6:33 PM, Noah Silverman wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Trying to set up a logistic regression model.  (Something new
>>>>>>> to me. I usually use SVM.)
>>>>>>>
>>>>>>> The person explaining the concept told me that I can include a
>>>>>>> "group" variable so that the probabilities predicted by the
>>>>>>> model will be "per group".
>>>>>>>
>>>>>>> Does this make sense to anyone?
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>> If so, how would I implement this?
>>>>>>> Using the glm or lrm function?
>>>>>>
>>>>>> Yes.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

```
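The suggestion in the thread, adding `group` to the formula as an ordinary predictor, can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the group levels `a`, `b`, `c` and all coefficients are invented, and base R's `glm` stands in for `lrm` so that the example runs without extra packages. It shows that each group coefficient is a constant shift on the log odds scale relative to the baseline level, which is what yields different predicted probabilities "per group" at the same `v1` and `v2`.

```r
# Simulated data: every name and number below is an assumption for
# illustration, not something taken from the original question.
set.seed(1)
n <- 600
d <- data.frame(
  v1    = rnorm(n),
  v2    = rnorm(n),
  group = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Generating model: common slopes for v1 and v2, one intercept shift per group.
shift <- c(a = 0, b = 1, c = -1)
eta   <- -0.5 + 0.8 * d$v1 - 0.6 * d$v2 + shift[as.character(d$group)]
d$label <- factor(ifelse(runif(n) < plogis(eta), "yes", "no"),
                  levels = c("no", "yes"))  # glm models P(label == "yes")

# group enters the formula like any other predictor. R expands the factor
# into indicator columns, with the first level ("a") as the baseline.
fit <- glm(label ~ v1 + v2 + group, data = d, family = binomial)
print(coef(fit))

# Predicted probability of "yes" for each group at the same v1 and v2.
newd <- data.frame(v1 = 0, v2 = 0,
                   group = factor(levels(d$group), levels = levels(d$group)))
p <- predict(fit, newdata = newd, type = "response")
names(p) <- levels(d$group)
print(p)

# On the log odds scale, the gap between group b and baseline a is exactly
# the groupb coefficient, whatever values v1 and v2 take.
lo    <- predict(fit, newdata = newd, type = "link")
gap_b <- unname(lo[2] - lo[1])
print(c(gap = gap_b, coef = coef(fit)[["groupb"]]))
```

If the model should instead allow the effect of `v1` or `v2` to differ between groups, an interaction term such as `v1 * group` would be the usual way to express that. Whether that is what the original poster needed is not clear from the thread.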