[R-sig-ME] Speeding up finding non-singular mixed effect models

David Halpern d@v|d@h@|pern @end|ng |rom nyu@edu
Wed Jul 31 19:49:02 CEST 2024


Hello all,

I'm not sure whether this is the appropriate venue, but I wanted to ask
for people's thoughts on some work Philip Greengard and I have been doing
on how to select random effects more efficiently in order to find a
parsimonious non-singular model based only on the data, without appeal to
theoretical concerns (i.e., the situation described in the Parsimonious
Mixed Models paper, https://arxiv.org/abs/1506.04967).

We have been testing the idea of using an interpolative decomposition
(https://epubs.siam.org/doi/10.1137/030602678) as a way of selecting
random effects in order to reduce a singular model
without refitting the model multiple times. Interpolative decomposition
finds the columns (or rows) of a matrix that approximately span the column
(or row) space. Our thinking is that this decomposition is a natural tool
for selecting which parameters to include in a non-maximal model.
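To make the idea concrete, here is a minimal sketch in R (not our actual code; a QR factorization with column pivoting is one standard way to compute the column selection underlying an interpolative decomposition):

```r
# Build a 6-column matrix whose numerical rank is 2.
set.seed(1)
A <- matrix(rnorm(4 * 2), 4, 2) %*% matrix(rnorm(2 * 6), 2, 6)

# LAPACK QR with column pivoting orders the columns by how much new
# column-space direction each one contributes.
piv <- qr(A, LAPACK = TRUE)$pivot
k <- 2
keep <- piv[1:k]            # the k columns that approximately span col(A)

# Check: the kept columns reproduce A up to (near-zero) least-squares error.
B <- A[, keep]
resid <- A - B %*% qr.solve(B, A)
max(abs(resid))
```

The same selection principle applies when the "columns" index random-effect terms rather than raw data.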

We played around with data from Gann and Barr (2012) available in the
RePsychLing package (associated with the Parsimonious Mixed Models paper).
In the GB vignette from the package, the analysis started with a maximal
model that included four parameters varying by session and four varying by
item. PCA indicated "two dimensions with no variability in the random
effects for session and another two dimensions in the random effects for
item."
The original model is:

  sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 + T + F + TF | session) + (1 + T + P + TP | item)
Using the algorithm described in the Parsimonious Mixed Models paper
requires fitting the model five times to arrive at a non-singular model.
The final model is:

  sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 + F | session) + (0 + T | session) + (1 | item)

In our approach, after fitting the maximal model, we apply an interpolative
decomposition to both the session and item random-effects covariance
matrices to find the best 2x2 submatrices, which gives:

  sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 + F | session) + (1 + P | item)

However, when we refit, we find that the item random effects are still not
full rank. Running the process again yields the final model:

  sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 + F | session) + (1 | item)

Assuming we do not allow off-diagonal elements to be forced to 0, this
final model matches the one found previously, with only three model fits
instead of five.
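As a self-contained illustration of the selection step (the covariance matrix here is made up, standing in for something like VarCorr(fit)$item from a singular lmer fit; our actual code is linked at the end), a pivoted Cholesky factorization both reveals the numerical rank and ranks the random-effect terms to retain:

```r
# Hypothetical 4x4 random-effects covariance of numerical rank 2,
# as lmer might estimate for a singular fit with four correlated terms.
L <- matrix(c(1, 0.5, 0.8, 0.2,
              0, 1.0, 0.3, 0.9), nrow = 4)
Sigma <- L %*% t(L)
rownames(Sigma) <- c("(Intercept)", "T", "P", "TP")

# Pivoted Cholesky orders terms by the variance they add; the "rank"
# attribute is the numerical rank (2 here). The warning about rank
# deficiency is expected and suppressed.
R <- suppressWarnings(chol(Sigma, pivot = TRUE))
k <- attr(R, "rank")
keep <- rownames(Sigma)[attr(R, "pivot")[1:k]]   # terms to retain
```

The retained terms then define the reduced random-effects structure to refit.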

We also applied the same approach to the Kliegl et al. (2011) dataset (from
the same package) and arrived at the same final model as the KWDYZ vignette
using only two model fits rather than three.

With large models, this method could potentially save a substantial amount
of time! But we don't fully understand why the initial interpolative
decomposition, based on the PCA of the full maximal model, still results in
a singular fit. We also tried applying the interpolative decomposition to
each batch of random effects iteratively, but this produced the same
result. We'd be very curious to hear thoughts on the overall approach, and
any intuition people might have for why the rank of the random-effects
matrices changes so much after refitting. Code for our experiments can be
found here:
https://github.com/dhalpern/lmer_id/blob/main/lmer_id_experiment_GB.R,
https://github.com/dhalpern/lmer_id/blob/main/lmer_id_experiment_KWDYZ.R.

Thanks so much for any comments or ideas!

Best,
David




More information about the R-sig-mixed-models mailing list