[R-sig-ME] When can the intercept be removed from regression models

Tue Jul 26 12:31:12 CEST 2016

Hi,

since all the stats experts are on this list, I have to ask a question
in relation to models without intercept.

In my layman's conception, in a model without an intercept like this one:

glmer(response ~ 0 + condition + (1 | study_participant) + (1 | test_item),
      data = data_frame, family = binomial,
      control = glmerControl(optimizer = "bobyqa"))

the levels of the predictor condition are not estimated in relation to
the intercept but against absolute zero. With binomial data this seems
quite handy as for each condition level the model tells me whether
performance was significantly different from chance (like multiple
intercepts), something a binomial test could do as well (albeit
without accounting for the random components structure).
This can be (and in psycholinguistic research often is) a research question.
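
A minimal sketch of the two parametrisations, assuming simulated data and
plain glm() instead of glmer() (so the random effects are ignored here; the
names d, m_int and m_zero and all probabilities are made up for
illustration). It suggests that, with a single factor predictor, dropping
the intercept leaves the fit unchanged and turns each coefficient into that
level's own logit, so each Wald test is a test against logit 0, i.e. against
a probability of 50%:

```r
set.seed(1)

# Hypothetical toy data: three condition levels with invented true
# success probabilities of 0.5, 0.7 and 0.9.
n <- 200
d <- data.frame(condition = factor(rep(c("a", "b", "c"), each = n)))
p_true <- c(a = 0.5, b = 0.7, c = 0.9)
d$response <- rbinom(nrow(d), 1, unname(p_true[as.character(d$condition)]))

# With intercept: level "a" is the reference, "b" and "c" are contrasts.
m_int  <- glm(response ~ condition,     data = d, family = binomial)
# Without intercept: one coefficient per level.
m_zero <- glm(response ~ 0 + condition, data = d, family = binomial)

# Same model, different parametrisation: the log-likelihoods agree.
all.equal(as.numeric(logLik(m_int)), as.numeric(logLik(m_zero)),
          tolerance = 1e-6)

# Without the intercept, each coefficient is the logit of that level's
# observed proportion of successes.
cell_logit <- qlogis(tapply(d$response, d$condition, mean))
cbind(coef = coef(m_zero), cell_logit = as.numeric(cell_logit))

# A logit of 0 corresponds to a probability of 0.5.
plogis(0)

# Wald z-tests of each level against logit 0.
summary(m_zero)$coefficients
```

Note that logit 0 equals chance only when chance is 50%, as in a
two-alternative task; for other chance levels it would not be the right
reference value.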

Or is this total nonsense?

I have to say that I am confused when it comes to the intercepts in
the random components ....
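
One way to picture the random intercepts, as a rough sketch with invented
numbers: `0 + condition` drops only the fixed intercept, while
`(1 | study_participant)` and `(1 | test_item)` still add a zero-mean offset
per participant and per item on the logit scale. The simulation below uses
base R only; the object names, sample sizes and standard deviations are
assumptions for illustration, not values from any real study.

```r
set.seed(2)

# All numbers below are invented for illustration.
n_subj <- 30
n_item <- 20
beta <- c(a = 0, b = 1)  # fixed logits per condition; 0 means 50%

# Random intercepts: zero-mean deviations on the logit scale.
u <- rnorm(n_subj, mean = 0, sd = 1.0)  # role of (1 | study_participant)
w <- rnorm(n_item, mean = 0, sd = 0.5)  # role of (1 | test_item)

d <- expand.grid(study_participant = seq_len(n_subj),
                 test_item         = seq_len(n_item),
                 condition         = names(beta),
                 stringsAsFactors  = TRUE)

# Linear predictor: fixed part for the condition plus both offsets.
d$eta <- unname(beta[as.character(d$condition)]) +
  u[d$study_participant] + w[d$test_item]
d$response <- rbinom(nrow(d), 1, plogis(d$eta))

# For a participant and an item whose offsets are both 0, condition "a"
# sits exactly at 50%; everyone else is shifted up or down from there.
plogis(beta[["a"]])

# If lme4 is installed, the model from this message could be fitted as:
# lme4::glmer(response ~ 0 + condition + (1 | study_participant) +
#               (1 | test_item), data = d, family = binomial)
```

Under this reading, the fixed coefficients describe a participant and item
with zero offsets, and the random components only describe spread around
them.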

Tom

---

Tom Fritzsche
University of Potsdam
Department of Linguistics
Karl-Liebknecht-Straße 24-25
14476 Potsdam
Germany

office: 14.140
phone: +49 331 977 2296
fax: +49 331 977 2095
e-mail: tom.fritzsche at uni-potsdam.de
web:    www.ling.uni-potsdam.de/~fritzsche

2016-07-26 12:08 GMT+02:00 Martin Maechler <maechler at stat.math.ethz.ch>:
>>>>>> Shadiya Al Hashmi <saah500 at york.ac.uk>
>>>>>>     on Tue, 26 Jul 2016 12:40:26 +0300 writes:
>
>     > Thanks Thierry for your response.  I tried the model
>     > before and after removing the intercept a while ago and I
>     > remember that the coefficients were pretty much the same.
>
> but other things are *not* pretty much the same, and you
> really really really should obey the advice by Thierry:
>
>    ALWAYS KEEP THE INTERCEPT IN THE MODEL !!!
>
> (at least until you become a very experienced statistician / data
>  scientist / .. )
>
>
>     >  The only salient difference was that the levels of
>     > the first categorical variable in the model formula were
>     > all given in the output table instead of the reference
>     > level being embedded in the intercept as in the model with
>     > intercept.
>
>     > It would be nice to find examples from the literature
>     > where the intercept is removed from the model.
>
> hopefully *not*!  at least not apart from the exceptions that
> Thierry mentions below.
>
>     > Can you think of any?
>
>     > Shadiya
>
>     >> On Jul 26, 2016, at 11:32 AM, Thierry Onkelinx
>     >> <thierry.onkelinx at inbo.be> wrote:
>     >>
>     >> Dear Shadiya,
>     >>
>     >> Thou shall always keep the intercept in the model. Its
>     >> p-value doesn't matter.
>     >>
>     >> I use two exceptions to that rule:
>     >> 1. There is a physical/biological/... reason why the
>     >>    intercept should be 0.
>     >> 2. Removing the intercept gives a different, more
>     >>    convenient parametrisation (but does not change the
>     >>    model fit!)
>     >>
>     >> Note that in logistic regression you use a logit
>     >> transformation. Hence forcing the model thru the origin
>     >> on the logit scale, forces the model to 50% probability
>     >> at the original scale. I haven't seen an example where
>     >> that makes sense.
>     >>
>     >> Bottom line: only remove the intercept when you really
>     >> know what you are doing.
>     >>
>     >> Best regards,
>     >>
>     >> ir. Thierry Onkelinx Instituut voor natuur- en
>     >> bosonderzoek / Research Institute for Nature and Forest
>     >> team Biometrie & Kwaliteitszorg / team Biometrics &
>     >> Quality Assurance Kliniekstraat 25 1070 Anderlecht
>     >> Belgium
>     >>
>     >> To call in the statistician after the experiment is done
>     >> may be no more than asking him to perform a post-mortem
>     >> examination: he may be able to say what the experiment
>     >> died of. ~ Sir Ronald Aylmer Fisher The plural of
>     >> anecdote is not data. ~ Roger Brinner The combination of
>     >> some data and an aching desire for an answer does not
>     >> ensure that a reasonable answer can be extracted from a
>     >> given body of data. ~ John Tukey
>     >>
>     >> 2016-07-26 9:50 GMT+02:00 Shadiya Al Hashmi
>     >> <saah500 at york.ac.uk>:
>     >>> Good morning,
>     >>>
>     >>> I am in a dilemma regarding the inclusion of the
>     >>> intercept in my mixed effects logistic regression
>     >>> models.  Most statisticians that I talked to insist that
>     >>> I shouldn’t remove the constant from my models.  One of
>     >>> the pros is that the models would be of good fit since
>     >>> the R2 value would be improved. Conversely, removing the
>     >>> constant means that there is no guarantee that we would
>     >>> not end up with biased coefficients, since the slopes
>     >>> would be forced to pass through 0.
>     >>>
>     >>> I found only one textbook which does not state it but
>     >>> rather seems to imply that sometimes we can remove the
>     >>> constant. This is the reference provided below.
>     >>>
>     >>> Cornillon, P.A., Guyader, A., Husson, F., Jégou, N.,
>     >>> Josse, J., Kloareg, M., Matzner-Løber, E. and Rouvière,
>     >>> L. (2012). *R for Statistics*. CRC Press, Taylor &
>     >>> Francis Group.
>     >>>
>     >>>
>     >>>
>     >>> On p.136, it says that “The p-value of less than 5% for
>     >>> the constant (intercept) indicates that the constant
>     >>> must appear in the model”.  So based on this, I am
>     >>> assuming that a p-value of more than 5% for the
>     >>> intercept would mean that the intercept should be
>     >>> removed.
>     >>>
>     >>> I would appreciate it if someone could help me with this
>     >>> conundrum.
>     >>>
>     >>> --
>     >>> Shadiya
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models