[R-sig-ME] When can the intercept be removed from regression models
Martin Maechler
maechler at stat.math.ethz.ch
Tue Jul 26 12:08:02 CEST 2016
>>>>> Shadiya Al Hashmi <saah500 at york.ac.uk>
>>>>> on Tue, 26 Jul 2016 12:40:26 +0300 writes:
> Thanks Thierry for your response. I tried the model
> before and after removing the intercept a while ago and I
> remember that the coefficients were pretty much the same.
but other things are *not* pretty much the same, and you
really really really should obey the advice by Thierry:
ALWAYS KEEP THE INTERCEPT IN THE MODEL !!!
(at least until you become a very experienced statistician / data
scientist / .. )
> The only salient difference was that the levels of
> the first categorical variable in the model formula were
> all given in the output table instead of the reference
> level being embedded in the intercept as in the model with
> intercept.
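A minimal base-R sketch (simulated data, not from the thread) of what that relabelling looks like: with a factor as the only predictor, dropping the intercept is merely a reparametrisation -- identical fit, relabelled coefficients -- but the reported R^2 changes, because it is then computed about 0 rather than about the mean.

```r
## Same model, two parametrisations: contrasts vs. one coefficient per level.
set.seed(1)
f <- factor(rep(c("a", "b", "c"), each = 10))
y <- c(1, 2, 3)[f] + rnorm(30, sd = 0.1)

m1 <- lm(y ~ f)      # intercept = mean of level "a"; fb, fc are differences
m2 <- lm(y ~ 0 + f)  # one coefficient per level: the three group means

all.equal(fitted(m1), fitted(m2))          # TRUE: identical fitted values
coef(m2)["fa"] - coef(m1)["(Intercept)"]   # ~ 0: same information, relabelled
c(summary(m1)$r.squared, summary(m2)$r.squared)  # the second is inflated
```

The inflated R^2 in the no-intercept fit is one of the "other things" that are *not* the same, even though the coefficients carry the same information.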
> It would be nice to find examples from the literature
> where the intercept is removed from the model.
hopefully *not*! at least not apart from the exceptions that
Thierry mentions below.
> Can you think of any?
> Shadiya
>> On Jul 26, 2016, at 11:32 AM, Thierry Onkelinx
>> <thierry.onkelinx at inbo.be> wrote:
>>
>> Dear Shadiya,
>>
>> Thou shalt always keep the intercept in the model. Its
>> p-value doesn't matter.
>>
>> I allow two exceptions to that rule:
>> 1. There is a physical/biological/... reason why the
>> intercept should be 0.
>> 2. Removing the intercept gives a different, more
>> convenient parametrisation (but does not change the
>> model fit!)
>>
>> Note that in logistic regression you use a logit
>> transformation. Hence forcing the model through the
>> origin on the logit scale forces the model to a 50%
>> probability on the original scale. I haven't seen an
>> example where that makes sense.
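A small base-R sketch (simulated data) of Thierry's point: with the intercept removed, the linear predictor is 0 at the baseline, and the inverse logit of 0 is 0.5, whatever the data say.

```r
## Forcing a logistic model through the origin on the logit scale pins
## the baseline probability at plogis(0) = 0.5.
plogis(0)                                # 0.5 : inverse logit of 0
set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(-2 + x))      # true baseline probability ~ 12%
g0 <- glm(y ~ 0 + x, family = binomial)  # intercept removed
plogis(predict(g0, newdata = data.frame(x = 0)))  # exactly 0.5, by construction
```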
>>
>> Bottom line: only remove the intercept when you really
>> know what you are doing.
>>
>> Best regards,
>>
>> ir. Thierry Onkelinx Instituut voor natuur- en
>> bosonderzoek / Research Institute for Nature and Forest
>> team Biometrie & Kwaliteitszorg / team Biometrics &
>> Quality Assurance Kliniekstraat 25 1070 Anderlecht
>> Belgium
>>
>> To call in the statistician after the experiment is done
>> may be no more than asking him to perform a post-mortem
>> examination: he may be able to say what the experiment
>> died of. ~ Sir Ronald Aylmer Fisher
>>
>> The plural of anecdote is not data. ~ Roger Brinner
>>
>> The combination of some data and an aching desire for an
>> answer does not ensure that a reasonable answer can be
>> extracted from a given body of data. ~ John Tukey
>>
>> 2016-07-26 9:50 GMT+02:00 Shadiya Al Hashmi
>> <saah500 at york.ac.uk>:
>>> Good morning,
>>>
>>> I am in a dilemma regarding the inclusion of the
>>> intercept in my mixed effects logistic regression
>>> models. Most statisticians I have talked to insist that
>>> I shouldn't remove the constant from my models. One of
>>> the pros is that the model would be a good fit, since
>>> the R2 value would be improved. Conversely, removing the
>>> constant means there is no guarantee against biased
>>> coefficients, since the slopes would be forced to pass
>>> through the origin.
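A quick base-R sketch (simulated data with assumed values) of that worry: when the true intercept is nonzero, forcing the fitted line through the origin biases the slope estimate.

```r
## True model: y = 5 + 2x + noise. The no-intercept fit must absorb the
## intercept into the slope, so its slope estimate is badly biased.
set.seed(3)
x <- runif(100, 1, 2)
y <- 5 + 2 * x + rnorm(100, sd = 0.2)

coef(lm(y ~ x))["x"]      # close to the true slope, 2
coef(lm(y ~ 0 + x))["x"]  # far from 2: forced through the origin
```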
>>>
>>> I found only one textbook which does not state it but
>>> rather seems to imply that sometimes we can remove the
>>> constant. This is the reference provided below.
>>>
>>> Cornillon, P.A., Guyader, A., Husson, F., Jégou, N.,
>>> Josse, J., Kloareg, M., Matzner-Løber, E., and Rouvière,
>>> L. (2012). *R for Statistics*. CRC Press, Taylor &
>>> Francis Group.
>>>
>>> On p.136, it says that “The p-value of less than 5% for
>>> the constant (intercept) indicates that the constant
>>> must appear in the model”. So based on this, I am
>>> assuming that a p-value of more than 5% for the
>>> intercept would mean that the intercept should be
>>> removed.
>>>
>>> I would appreciate it if someone could help me with this
>>> conundrum.
>>>
>>> --
>>> Shadiya