[R] glm.fit: fitted probabilities numerically 0 or 1 occurred & glm.fit: algorithm did not converge
David Winsemius
dwinsemius at comcast.net
Fri Aug 12 21:37:00 CEST 2016
> On Aug 12, 2016, at 11:32 AM, Shivi Bhatia <shivipmp82 at gmail.com> wrote:
>
> Hi Michael,
>
> During the masking process some of the variables were missed. Please find
> the updated file.
>
> Also, here is the updated code (I have removed one of the variables as it
> had missing information):
>
> glm.fit = glm(survey ~ support_cat + region + support_lvl + skill_group +
>               application_area + functional_area +
>               repS + case_age + case_status + severity_level +
>               sla_status, data = new, family = binomial)
I think you need to do some more data cleaning:
> with(new, table(survey, repS, severity_level) )
, , severity_level =

      repS
survey   0   1
     0   0   0
     1   0   0

, , severity_level = high

      repS
survey   0   1
     0  52  18
     1   4 193

, , severity_level = medium

      repS
survey   0   1
     0  69  16
     1   7 367

, , severity_level = no

      repS
survey   0   1
     0   0   0
     1   0   1

, , severity_level = none

      repS
survey   0   1
     0  31  19
     1   4 183
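
A minimal sketch of the kind of cleaning the table points to: severity_level has an empty level with no observations and a "no" level with a single observation. The data frame `toy` below is made up for illustration and is not the poster's `new`. Treating "no" as a mistyped "none" is an assumption that only the data owner can confirm.

```r
# Hypothetical toy data, NOT the poster's data set.
toy <- data.frame(
  survey = c(0, 1, 1, 0, 1, 1, 0, 1),
  severity_level = factor(
    c("high", "high", "medium", "medium", "none", "no", "none", "medium"),
    levels = c("", "high", "medium", "no", "none")
  )
)

# Count observations per level; empty or near-empty levels stand out.
level_counts <- table(toy$severity_level)
sparse <- names(level_counts)[level_counts < 2]

# Assumption: "no" is a mistyped "none", so fold it into "none".
toy$severity_level[toy$severity_level == "no"] <- "none"

# Remove levels that no longer have any observations.
toy$severity_level <- droplevels(toy$severity_level)

levels(toy$severity_level)
```

Running the same per-level count on every factor in the model formula before calling glm() is a cheap way to spot levels that cannot support a coefficient estimate.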
--
David.
> Kindly assist with the same.
>
> On Fri, Aug 12, 2016 at 11:05 PM, Michael Dewey <lists at dewey.myzen.co.uk>
> wrote:
>
>> Your example code refers to a variable which is not in your dataset (repS)
>> so I get an error message. If I assume repS is in fact rep_score I get
>> another variable not found (delivery_segmentation).
>>
>> I am afraid that I am unable to sort that one out so this is going to
>> remain a mystery. I endorse Bert's suggestion of getting local help.
>>
>> On 12/08/2016 17:24, Shivi Bhatia wrote:
>>
>>> Hi Bert,
>>>
>>> Does this text file help? Apologies if it does not, as I often have a
>>> hard time producing a reproducible example.
>>>
>>> If this doesn't work, I can share a CSV with only 100 KB of data.
>>>
>>> Regards, Shivi
>>>
>>> On Fri, Aug 12, 2016 at 8:50 PM, Shivi Bhatia <shivipmp82 at gmail.com>
>>> wrote:
>>>
>>> Sure Bert, I will share the data after masking it. It isn't big.
>>>
>>> regards, Shivi
>>>
>>> On Fri, Aug 12, 2016 at 8:36 PM, Bert Gunter <bgunter.4567 at gmail.com>
>>> wrote:
>>>
>>> 1. No, changing to factor will make no difference.
>>>
>>> 2. I think that most likely your problem is your model is not
>>> estimable/your design matrix is singular. You should resolve this by
>>> consulting with a local statistical expert or, if your data set is not
>>> too large or confidential, posting your full dataset using dput() (see
>>> ?dput for how to do this).
>>>
>>> Cheers,
>>> Bert
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Fri, Aug 12, 2016 at 7:58 AM, Shivi Bhatia
>>> <shivipmp82 at gmail.com> wrote:
>>>> Hi Michael,
>>>>
>>>> There is no output as the model does not generate any coefficients and
>>>> simply throws this error.
>>>>
>>>> I hope you are not asking for a reproducible example.
>>>>
>>>> On Fri, Aug 12, 2016 at 7:30 PM, Michael Dewey
>>>> <lists at dewey.myzen.co.uk> wrote:
>>>>
>>>>> Dear Shivi
>>>>>
>>>>> Can you show us the output?
>>>>>
>>>>> And please do not post in HTML as it will mangle your post into
>>>>> unreadability.
>>>>>
>>>>> On 12/08/2016 10:10, Shivi Bhatia wrote:
>>>>>
>>>>>> Hi Team,
>>>>>>
>>>>>> I am creating *my first* logistic regression in RStudio. I am working on
>>>>>> C-SAT data where a rating (score) of 0-8 is a dis-sat whereas 9-10 is a
>>>>>> SAT. As these were in numeric form, I created 2 classes as below:
>>>>>>
>>>>>> new$survey[new$score>=0 & new$score<=8]<- 0
>>>>>> new$survey[new$score>=9]<- 1
>>>>>> This works fine; however, the class still shows as "numeric" and levels
>>>>>> shows as "NULL". Do I still need to use "as.factor" to let R know these
>>>>>> are categorical variables?
>>>>>>
>>>>>> Also, I have used the code below to run a logistic regression with all
>>>>>> the possible predictor variables:
>>>>>> glm.fit = glm(survey ~ support_cat + region + support_lvl + skill_group +
>>>>>>               application_area + functional_area +
>>>>>>               repS + case_age + case_status + severity_level +
>>>>>>               sla_status + delivery_segmentation, data = SFDC,
>>>>>>               family = binomial)
>>>>>>
>>>>>> But it throws an error:-
>>>>>> Warning messages:
>>>>>> 1: glm.fit: algorithm did not converge
>>>>>> 2: glm.fit: fitted probabilities numerically 0 or 1 occurred
>>>>>>
>>>>>> I checked online for the error and it says:
>>>>>> "glm() uses an iterative re-weighted least squares algorithm. The
>>>>>> algorithm hit the maximum number of allowed iterations before signalling
>>>>>> convergence. The default, documented in ?glm.control is 25."
>>>>>>
>>>>>> Kindly advise on the above case and on whether I have to convert my
>>>>>> outcome variable with as.factor.
>>>>>>
>>>>>> Thank you, Shivi
>>>>>>
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>> --
>>>>> Michael
>>>>> http://www.dewey.myzen.co.uk/home.html
>>>>>
>>>>
>> --
>> Michael
>> http://www.dewey.myzen.co.uk/home.html
>>
David Winsemius
Alameda, CA, USA