[R-sig-ME] A question about multicollinear fixed/random factors

N o s t a l g i a kenj|ro @end|ng |rom @ho|n@@c@jp
Sat Aug 20 10:40:36 CEST 2022


I am looking at character variation in Japanese parliamentary 
minutes, where the same character appears in two forms. Within each 
parliamentary session there are a number of different committee 
meetings, and I am looking at 31 sessions over 10 years. The factors 
I am considering are: the upper/lower house distinction (ul), the 
meeting (meetings within each session, which differ from session to 
session), the number of days between 1949/5/20 (when the first 
parliament was held) and the meeting (days), and the word in which 
the character appears. Of these, meeting and word are random factors, 
and they have hundreds of levels. The total number of cases is over 
one million.

The model I am considering is:

glmer(character ~ ul + days + (1|word) + (1|meeting),
      data = glmmdata.1, family = binomial)

And here is my question: since a given meeting is unique not only 
within its session but across all the data, there would be a 
multicollinear relationship between days and meeting, in that 
specifying a meeting necessarily fixes a specific value of days. Is 
it a problem in a GLMM to have such a pair of fixed and random 
factors? If so, is there any way to avoid the problem?
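
To make the relationship concrete, here is a minimal sketch of the 
check I have in mind. It uses a small made-up data frame (toy), not 
my real data; only the column names meeting and days are taken from 
the model above, and everything else is hypothetical.

```r
# Toy data (hypothetical): three meetings, each held on a single day,
# so every row of a given meeting carries the same value of days.
toy <- data.frame(
  meeting = factor(c("m1", "m1", "m2", "m2", "m3")),
  days    = c(100, 100, 250, 250, 400)
)

# Count the distinct values of days within each meeting.
n_days <- tapply(toy$days, toy$meeting, function(x) length(unique(x)))

# TRUE when days is constant within every meeting, i.e. when knowing
# the meeting determines the value of days.
all(n_days == 1, na.rm = TRUE)
```

Running the same check on glmmdata.1 instead of toy would show 
whether days really takes a single value within each meeting.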

Thanks in advance,

Kenjiro Matsuda
Professor in Linguistics
Kobe Shoin Women's University
Kobe, Japan
