[R-sig-ME] Model Definition and Interpretation - Interactions, plus Singularity

Wed Mar 6 19:53:34 CET 2019

Hi all, I hope all is well.

I'll try to keep this as brief as possible.

I am trying to fit a mixed logistic regression to the following data structure. The variable names are given in parentheses at the end of each item.

- 100+ students each reviewed 100 items; everyone saw the same items, but in a different order. (ID)

- The response variable is binary (correct or not) (correct)

- The position in the order is coded by a variable running from 1 to 100. This scale is common to everyone, although the item shown at a given position differs from person to person. (Sequence)

- The items belong to different categories, and the number of items per category varies, e.g., ItemTypeA has 20 items, TypeB has 15, etc. (caseType). There are 100 unique items in total. As a simple, preliminary case I am sorting the items into only two categories that are as balanced as possible, approx. 56 vs 44.

The goal of the study is to analyze how students' performance changes as they review an increasing number of items of different kinds. We know from theory that students have different initial abilities and that they learn at different rates. Additionally, the different types of items vary in "difficulty" (we showed this using IRT). Because each person reviewed the same items but in a different order, we expect to see different performance curves both by person and by case type, hence the random-coefficient logistic models presented below.
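For anyone who wants something to run, the design described above can be mimicked with simulated data. This is only a minimal sketch: the column names ID, Sequence, caseType and correct come from this message, while the data frame name `d`, the `item` column, the number of students (120) and every effect size and variance are made-up assumptions chosen purely for illustration.

```r
# Hypothetical simulation of the design described above. All numeric values
# (number of students, effect sizes, variances) are invented for illustration.
set.seed(1)
n_id   <- 120   # "100+" students; 120 is an arbitrary choice
n_item <- 100   # 100 unique items

# Fixed assignment of items to two categories, 56 vs 44
item_type <- rep(c("A", "B"), times = c(56, 44))

# Each student sees all 100 items in an independent random order
d <- do.call(rbind, lapply(seq_len(n_id), function(i) {
  ord <- sample(n_item)  # item shown at positions 1..100
  data.frame(ID = i, Sequence = seq_len(n_item), item = ord,
             caseType = item_type[ord])
}))
d$ID       <- factor(d$ID)
d$caseType <- factor(d$caseType)

# Person-specific initial ability (b0) and learning rate (b1),
# assumed normal and independent of each other
b0  <- rnorm(n_id, mean = 0, sd = 0.8)
b1  <- rnorm(n_id, mean = 0, sd = 0.2)
id  <- as.integer(d$ID)
s   <- as.numeric(scale(d$Sequence))

# Linear predictor on the logit scale; type "B" is assumed to be harder
eta <- (0.3 + b0[id]) + (0.4 + b1[id]) * s - 0.5 * (d$caseType == "B")
d$correct <- rbinom(nrow(d), size = 1, prob = plogis(eta))
```

Because the order is randomized per student, the mix of case types at any given Sequence value differs from person to person, which is the feature of the real data that the models below are meant to handle.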

1. glmer(correct ~ scale(Sequence) + (scale(Sequence) | ID:caseType), family = binomial)

2. glmer(correct ~ scale(Sequence):caseType + (scale(Sequence):caseType | ID), family = binomial)
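To make concrete what each formula asks for, the terms can be expanded on a tiny toy data set using base R only. This is a sketch under assumptions: the data frame `toy` and its values are hypothetical, and only the formula pieces are taken from the two models above.

```r
# Toy data: 2 hypothetical students, 4 sequence positions, 2 case types
toy <- expand.grid(Sequence = 1:4, ID = factor(1:2))
toy$caseType <- factor(c("A", "B", "A", "B", "B", "A", "B", "A"))

# Fixed-effects design implied by model 2: a common intercept plus one
# Sequence slope per case type (no intercept difference between types)
X2 <- model.matrix(~ scale(Sequence):caseType, data = toy)
colnames(X2)

# Fixed-effects design implied by model 1: one intercept, one common slope
X1 <- model.matrix(~ scale(Sequence), data = toy)
colnames(X1)

# Grouping factor implied by the random term of model 1: one level per
# person-by-case-type combination, each with its own intercept and slope
g1 <- with(toy, ID:caseType)
levels(g1)
```

Read this way, model 1 lets intercepts and slopes vary across person-by-case-type cells around a single average slope, whereas model 2 estimates a separate average slope per case type and lets those type-specific slopes (and the intercept) vary by person.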

I've fitted these two models to capture the different learning rates by person and by case type, but I am not sure, first, whether the interaction is correctly specified and, second, where and how to specify it given the needs of my problem (person by case type, or number of items reviewed by case type; random or fixed). Are case types nested within persons, even if the number of items per case type differs? Or is the interaction of case type with the position in the sequence more informative for my purpose?

The first model's coef()/ranef() output is very attractive, since it gives an intercept and a slope for each person-by-case-type combination. However, after carefully reviewing the answers in this discussion<https://stats.stackexchange.com/questions/31569/questions-about-how-random-effects-are-specified-in-lmer>, I moved to model 2 because its interpretation made more sense. I am still unsure which is more appropriate for my needs. I am increasingly inclined towards the second model, but it gives a singular fit (a +1 correlation between random effects). I've looked for possible solutions that do not require going Bayesian, but I am not sure how to implement those either, so I tried rstanarm. Are there any suggestions about the priors?
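As a starting point for discussion, here is a minimal sketch of how the singular fit could be checked in lme4 and how model 2 could be passed to rstanarm with an explicit prior on the random-effects covariance. It assumes a data frame `d` with columns correct, Sequence, caseType and ID; the function names `check_singular` and `fit_bayes` are hypothetical, and the prior values are arbitrary illustrative choices, not recommendations.

```r
# Sketch only: `d` is a hypothetical data frame with the columns
# correct, Sequence, caseType and ID described in this message.
f2 <- correct ~ scale(Sequence):caseType + (scale(Sequence):caseType | ID)

# Fit model 2 with lme4 and report whether the fit is singular
check_singular <- function(d) {
  fit <- lme4::glmer(f2, data = d, family = binomial)
  list(
    singular = lme4::isSingular(fit),  # TRUE if the estimate is on the boundary
    varcorr  = lme4::VarCorr(fit)      # random-effect SDs and correlations
  )
}

# Fit the same model with rstanarm, with weakly informative priors
fit_bayes <- function(d) {
  rstanarm::stan_glmer(
    f2, data = d, family = binomial,
    prior           = rstanarm::normal(0, 2.5),  # slopes, logit scale
    prior_intercept = rstanarm::normal(0, 2.5),
    # decov() is rstanarm's prior for the random-effects covariance;
    # regularization > 1 favors correlations nearer zero. The value 2
    # is an assumption made for illustration only.
    prior_covariance = rstanarm::decov(regularization = 2),
    chains = 4, iter = 2000, seed = 1
  )
}
```

Calling `check_singular(d)` and then `fit_bayes(d)` would give a direct comparison of the estimated random-effect correlations with and without the regularizing prior. The functions only touch lme4 and rstanarm when they are called, so both packages need to be installed at that point.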

I will continue to try out the different suggestions presented in the different threads around singularities on lme4.

Finally, I looked for a suitable dataset to make this reproducible; in any case, I hope this is more of a conceptual discussion.

Similar questions about singular fits: https://stats.stackexchange.com/questions/378939/dealing-with-singular-fit-in-mixed-models


Thank you in advance,

Best,

Ilan Reinstein
