[R-sig-ME] Model Definition and Interpretation - Interactions, plus Singularity

Wed Mar 13 02:04:04 CET 2019

Hi Harold and Phillip, thanks for your reply and sorry for the delay.

Your insight was quite helpful. Some clarifications follow, since I may not have been clear before; I apologize.

Harold - the difficulty estimates I mentioned were obtained by fitting an LLTM to our data. That is how we know the case types are different; this model also provides item-level difficulty estimates. The problem is that we are only interested in capturing the change in ability at the case type level, suggesting that the subjects actually understand things at that level (type-vs-type, not item-by-item). I read the reference you sent, and although it was interesting, we are hitting some challenges when taking the IRT path for this analysis, so I am not sure it will help us.

Phillip - thanks for pointing out the incorrect definition of the interaction I was using; the model makes more sense as you suggested. However, in both cases there is no convergence: sometimes the fit is singular, while in other situations it simply does not converge. Would using a different optimizer and setting calc.derivs = FALSE in glmerControl help?
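
For reference, the control settings asked about above could be passed roughly as in the sketch below. It is only an illustration: the toy data frame d, its effect sizes, and the maxfun value are made up, and the model is deliberately simpler than the ones discussed in the thread. Note that calc.derivs = FALSE only skips the gradient/Hessian check that lme4 runs after optimization, so it can silence the warnings produced by that check but does not change the fit itself; switching the optimizer (here to "bobyqa") is the part that may actually alter where the fit ends up, and neither setting is expected to remove a genuinely singular estimate.

```r
library(lme4)

# Control object: different optimizer, no post-fit derivative check,
# and a larger evaluation budget (2e5 is an arbitrary illustrative value).
ctrl <- glmerControl(
  optimizer   = "bobyqa",
  calc.derivs = FALSE,
  optCtrl     = list(maxfun = 2e5)
)

# Toy data standing in for the real responses (all values are assumptions).
set.seed(1)
d <- expand.grid(ID = factor(1:30), Sequence = 1:20)
d$caseType <- factor(sample(c("A", "B", "C"), nrow(d), replace = TRUE))
u   <- rnorm(30, sd = 0.5)                      # by-person intercepts
eta <- -0.2 + 0.4 * as.numeric(scale(d$Sequence)) + u[as.integer(d$ID)]
d$correct <- rbinom(nrow(d), 1, plogis(eta))

fit <- glmer(correct ~ 1 + scale(Sequence) * caseType +
               (1 + scale(Sequence) | ID),
             data = d, family = binomial, control = ctrl)

# A singular fit is reported separately from convergence warnings.
isSingular(fit)
```

lme4 also provides allFit(), which refits a model with every available optimizer; if the estimates agree across optimizers, the convergence warnings are probably not a serious concern.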

>> If each item belongs uniquely to a single caseType, then you can't estimate caseType by item effects, but you could potentially estimate sequence effects:

>> This latter model would pick up e.g. whether a given item is more or less likely to be correctly rated, beyond the effect of its overarching caseType or sequence position. lme4 will pick up on the "between" design aspects of the item component without you having to be more explicit.

Also, I am not sure I understand what you mean by sequence and case type effects over items. In this particular case, the number at which the item is presented (sequence) _does_ matter, since we are looking for different changes in ability by person, which in turn are set to be different according to the case type. What is the advantage and the interpretation of the inclusion of the variation over items, whether as a slope or an intercept?

Thanks again for your valuable input, it has been quite enriching in my lme4 journey.

Best,

Ilan

________________________________
From: Doran, Harold <HDoran using air.org>
Sent: Thursday, March 7, 2019 11:05:47 AM
To: Phillip Alday; Reinstein, Ilan; r-sig-mixed-models using r-project.org
Subject: RE: [R-sig-ME] Model Definition and Interpretation - Interactions, plus Singularity

Ilan

I was reading this a bit and I was wondering if you are estimating what you actually wanted to estimate? I was uncertain. It sounded from the description you provided that you wanted to measure change in abilities? But it appears from the lmer call you're estimating item effects (but you already have item parameters for those items?)

That is, your code reminds me of the paper we published on this some time ago, link below.

https://www.jstatsoft.org/article/view/v020i02

If you're interested in measuring change in abilities, you might consider direct estimation (like used in NAEP).

-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On Behalf Of Phillip Alday
Sent: Thursday, March 07, 2019 10:57 AM
To: Reinstein, Ilan <Ilan.Reinstein using nyulangone.org>; r-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Model Definition and Interpretation - Interactions, plus Singularity

Hello,

On 6/3/19 7:53 pm, Reinstein, Ilan wrote:
> Hi all, I hope all is well.
>

::snip::

>
>
> 1. glmer(correct ~ scale(Sequence) + (scale(Sequence) | ID:caseType),
> family = binomial)
>
> 2. glmer(correct ~ scale(Sequence):caseType +
> (scale(Sequence):caseType | ID), family = binomial)

This model is a little unusual -- it includes interactions without main effects, which usually doesn't make sense.

I think you want:

glmer(correct ~ 1 + scale(Sequence)*caseType + (1+scale(Sequence)*caseType | ID), family = binomial)

(Note: I always make my intercept term explicit, both to remind that it's there, and because some of the other software I use doesn't add an implicit intercept.)

I wouldn't worry about achieving balance via artificial methods -- lme4 doesn't require it and the lack of balance will primarily show itself as a difference in the intercept term, which isn't a big deal, and to a lesser extent a difference in the standard errors -- they may actually be better because you have more data overall.

Finally, you have 15ish (or at least more than 10 based on your description and a logical leap) items per category, so you have more than enough items to also estimate item effects. If each item belongs uniquely to a single caseType, then you can't estimate caseType by item effects, but you could potentially estimate sequence effects:

glmer(correct ~ 1 + scale(Sequence)*caseType + (1+scale(Sequence)*caseType | ID) + (1+scale(Sequence)|Item), family = binomial)

and if that fails to converge, you can try a model that allows only for intercept-level variation by item:

glmer(correct ~ 1 + scale(Sequence)*caseType + (1+scale(Sequence)*caseType | ID) + (1|Item), family = binomial)

This latter model would pick up e.g. whether a given item is more or less likely to be correctly rated, beyond the effect of its overarching caseType or sequence position. lme4 will pick up on the "between" design aspects of the item component without you having to be more explicit.

About your convergence problems: assuming they still linger after adding in the main effects, I would interpret them as there not being enough data to estimate the by-subject and by-item differences in the sequence effect, which isn't horrible nor particularly surprising for me: I expect the overall effect of sequence and the general variation between subjects and items (i.e. the corresponding random intercepts) to be much larger than the variation between subjects and items in the sequence effect.
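
One way to check the reasoning above is to fit a model with by-subject variation in the sequence effect next to one with random intercepts only, and compare them. The sketch below does this on simulated data in which the true slope variance is zero; the data frame d, the sample sizes, the effect sizes, and the column seq_z are assumptions made for illustration and do not come from the original study.

```r
library(lme4)

set.seed(42)
n_id <- 40; n_item <- 30

# Each item belongs to exactly one caseType (a "between items" factor).
items <- data.frame(Item     = factor(1:n_item),
                    caseType = factor(rep(c("A", "B"), each = n_item / 2)))
d <- merge(expand.grid(ID = factor(1:n_id), Item = items$Item), items)

# Every person sees the items in their own random order.
d$Sequence <- ave(seq_len(nrow(d)), d$ID,
                  FUN = function(i) sample(length(i)))
d$seq_z <- as.numeric(scale(d$Sequence))

# Simulate responses with random intercepts only (no slope variation).
u_id   <- rnorm(n_id,   sd = 0.8)
u_item <- rnorm(n_item, sd = 0.5)
eta <- 0.3 * d$seq_z + 0.5 * (d$caseType == "B") +
  u_id[as.integer(d$ID)] + u_item[as.integer(d$Item)]
d$correct <- rbinom(nrow(d), 1, plogis(eta))

ctrl <- glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5))

m_slope <- glmer(correct ~ 1 + seq_z * caseType + (1 + seq_z | ID) + (1 | Item),
                 data = d, family = binomial, control = ctrl)
m_int   <- glmer(correct ~ 1 + seq_z * caseType + (1 | ID) + (1 | Item),
                 data = d, family = binomial, control = ctrl)

isSingular(m_slope)    # TRUE would point to an over-specified random part
VarCorr(m_slope)       # inspect the slope SD and its correlation
anova(m_int, m_slope)  # likelihood-ratio comparison of the two structures
```

If the slope standard deviation is estimated near zero, or its correlation with the intercept sits at +1 or -1, the simpler model is usually the more defensible description of the data.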

Best,
Phillip

>
> I've fitted these two models to capture the different learning rates by person and by case type but I am not sure about, first, if the interaction is correctly specified, and second, where and how to specify the interaction given the needs of my problem (person-case or # items-case, random or fixed). Are cases nested within persons, even if the number of items by case differs? Or is the interaction of case type with the number on the sequence more informative for my purpose?
>
>
> The first model's coef()/ranef() output is very attractive, since I can have an intercept and a slope for the interaction of person and case type. However, after carefully reviewing the answers in this discussion<https://stats.stackexchange.com/questions/31569/questions-about-how-random-effects-are-specified-in-lmer>, I moved to model number 2 since it made more sense in the interpretation; still, I am unsure which is more appropriate for my needs. I am starting to lean towards the second model, but it is a singular fit (+1 correlation of random effects). I've looked for possible solutions without the need to go Bayesian, but I am not sure how to implement those either, so I tried going to rstanarm. Are there any suggestions about the priors?
>
>
> I will continue to try out the different suggestions presented in the different threads around singularities on lme4.
>
>
> Finally, I looked for a suitable dataset for reproducibility but I hope this is more of a conceptual discussion.
>
>
> Similar questions about singular fits:
> https://stats.stackexchange.com/questions/378939/dealing-with-singular-fit-in-mixed-models
>
>
>
>
> Thank you in advance,
>
>
> Best,
>
>
> Ilan Reinstein