[R-sig-ME] keeping both numerically and factor coded factors

Thu Aug 1 19:38:02 CEST 2019

  I generally agree with Robert's point of view - I don't *necessarily*
object to removing correlations, but you have to think carefully about
what it means.

  As to the question of "how should I put more than one factor into a
compound symmetric model?": suppose you want to make (L*V*D|subjects)
compound symmetric.  You (unfortunately) have a variety of choices.  If
you really want all CS interactions represented, I think you need the
equivalent of (1|subjects/(L+V+D)^2) (which probably won't work as
written), i.e.

 (1|subjects) +
 (1|subjects:L) + (1|subjects:V) + (1|subjects:D) +
 (1|subjects:L:V) + (1|subjects:V:D) + (1|subjects:D:L)

(if you included the (1|subjects:L:V:D) term it would be redundant with
the residual error term).  This is getting complex again -- 7 parameters
(still much better than (L*V*D|subjects), which gives you (16*17)/2 =
136 parameters to estimate) ...
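
The parameter counts above can be checked with a short base-R sketch. It
assumes only the 2 x 2 x 4 within-subject design described in the thread;
the object names (design, cs_terms, cs_formula) are made up for
illustration, and no model is fitted here.

```r
# Within-subject design from the thread: L (2 levels) x V (2 levels) x D (4 levels).
design <- expand.grid(L = factor(1:2), V = factor(1:2), D = factor(1:4))

# Number of random-effect columns implied by (L*V*D | subjects).
p <- ncol(model.matrix(~ L * V * D, data = design))

# An unstructured p x p covariance matrix has p variances
# plus p*(p-1)/2 covariances, i.e. p*(p+1)/2 parameters.
n_unstructured <- p * (p + 1) / 2

# Compound-symmetric alternative: one variance per intercept-only term.
cs_terms <- c("(1 | subjects)",
              "(1 | subjects:L)", "(1 | subjects:V)", "(1 | subjects:D)",
              "(1 | subjects:L:V)", "(1 | subjects:V:D)", "(1 | subjects:D:L)")
n_cs <- length(cs_terms)

# Assemble the full model formula (MEPzed is the response named in the thread).
cs_formula <- reformulate(c("L * V * D", cs_terms), response = "MEPzed")

cat("unstructured:", n_unstructured, " compound symmetric:", n_cs, "\n")
print(cs_formula)
```

The resulting cs_formula could then be passed to lmer() together with
whatever items term is kept; that step is left out here because it needs
the real data.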

  I'm not sure rstanarm will solve your problems.  That is, I don't see
how the convergence diagnostics that rstanarm gives you are going to be
much more useful than lme4's in deciding how to simplify the problem.
On the other hand, rstanarm offers a big advantage in allowing you to
set priors to keep the solutions to the fitted problem more realistic -
it also integrates over the uncertainty in a useful way.
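
For concreteness, here is a minimal sketch of what setting priors might
look like with rstanarm's stan_lmer(), using the compound-symmetric terms
listed above. The data frame is a simulated stand-in for mydata, the items
term is omitted, and the prior settings are arbitrary illustrative values,
not recommendations. Sampling is switched off by default (run_fit is a
hypothetical flag) because it is slow and needs rstanarm installed.

```r
# Compound-symmetric random structure for subjects, as listed above.
cs_formula <- MEPzed ~ L * V * D +
  (1 | subjects) +
  (1 | subjects:L) + (1 | subjects:V) + (1 | subjects:D) +
  (1 | subjects:L:V) + (1 | subjects:V:D) + (1 | subjects:D:L)

# Hypothetical stand-in for `mydata`: 8 subjects, 3 replicates per cell,
# pure-noise response. Replace with the real data.
set.seed(1)
mydata <- expand.grid(subjects = factor(1:8), rep = 1:3,
                      L = factor(c("a", "b")), V = factor(c("x", "y")),
                      D = factor(1:4))
mydata$MEPzed <- rnorm(nrow(mydata))

run_fit <- FALSE  # set to TRUE to actually sample (slow; requires rstanarm)
if (run_fit && requireNamespace("rstanarm", quietly = TRUE)) {
  fit <- rstanarm::stan_lmer(
    cs_formula, data = mydata,
    # Illustrative prior on the fixed-effect coefficients (assumed values).
    prior = rstanarm::normal(0, 1),
    # Illustrative prior on the random-effect standard deviations
    # (assumed values; for intercept-only terms decov() reduces to a
    # gamma prior with this shape and scale).
    prior_covariance = rstanarm::decov(shape = 2, scale = 1),
    chains = 2, iter = 1000, seed = 1
  )
  print(fit)
}
```

A shape greater than 1 in decov() pulls the standard deviations away from
exactly zero, which is one way priors can keep the fitted solution off the
singular boundary.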

  [Robert: sorry if I missed or misconstrued something in your answer.
Could you be a little more specific in how you would use rstanarm's
output & diagnostics to help solve this kind of problem?]

 Ben Bolker

On 2019-08-01 10:02 a.m., Robert Long wrote:
> Dear Elisa,
> 
> Yes, one of the possible steps is to force correlations to zero, but then
> you are imposing (possibly unreasonable) constraints at the cost of trying
> to make the model converge. It is a highly questionable procedure to remove
> something or impose constraints purely to cause a model to converge. Random
> variables that arise in nature as part of the same data generating process
> are rarely uncorrelated. It may be that the correlations are small and
> /can/ reasonably be set to zero, but you should investigate whether this is
> reasonable first.
> 
> Removing random slopes is usually a good way to proceed.
> 
> If you can't make progress this way you could try the rstanarm package
> which provides a drop in replacement for lmer and will fit the model using
> a Bayesian approach. Then, the convergence diagnostics should provide a
> better way to solve the problem. It may be that one or more of the variance
> components and/or correlations between them are close to zero, in which
> case you can remove them from the random structure.
> 
> 
> On Wed, 31 Jul 2019, 09:14 MONACO Elisa, <elisa.monaco using unifr.ch> wrote:
> 
>> Thank you,
>>
>> Robert Long, I think we are making the same point: the maximal model is
>> too complex (overparameterized and with a degenerate/singular solution) and
>> I want to reduce the random structure, following the steps suggested by
>> Bates et al. Am I correct?
>> However, one of these steps is indeed "forcing to zero the correlation
>> parameters" and checking the fit of the resulting model. Hence my
>> question on how to arrange my D factor in the random structure.
>>
>> I still don't know how to handle the CS model suggested by Bolker ((1|g/f))
>> and how to integrate more factors in that structure ((f1*f2|g/f3)?) ... any
>> suggestions would be much appreciated!
>>
>> Elisa Monaco
>>
>>
>> -----Original Message-----
>> From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On
>> behalf of Robert Long
>> Sent: Wednesday, 24 July 2019 10:33
>> To: R-mixed models mailing list <r-sig-mixed-models using r-project.org>
>> Subject: Re: [R-sig-ME] keeping both numerically and factor coded factors
>>
>> It is quite possible that such a complex random structure will not be
>> supported by the data.
>>
>> In your initial email you mentioned correlations between random effects.
>> However, since the model did not converge, there is no point in
>> interpreting them. Moreover, forcing them to be uncorrelated possibly
>> imposes unrealistic constraints on the model.
>>
>> Why do you seek such a complex random structure? If you are following the
>> advice by Barr et al (2013) to "keep it maximal", this is often very poor
>> advice, as noted by Bates et al (2015), Bates being the primary author of
>> the lme4 package:
>>
>> https://arxiv.org/pdf/1506.04967
>>
>> On Wed, 24 Jul 2019, 09:01 MONACO Elisa, <elisa.monaco using unifr.ch> wrote:
>>
>>> Dear all,
>>> many thanks for your answers and sorry for not providing the details.
>>>
>>> My experiment is a 2X2X4 within subject design, with all three factors
>>> being categorical: L=Language of the stimuli (2 levels), V= type of
>>> the stimuli (2 levels), D= delay of brain stimulation (4 levels). My
>>> dependent variable is the amplitude of a physiological measure.
>>>
>>> I thought to build my maximal mixed model in which all the factors are
>>> crossed within subjects and only D is crossed within items (items are
>>> the same, repeated at different delays of stimulation):
>>>
>>> lmer(MEPzed ~ L * V * D  + (D|items), data=mydata,
>>> control=lmerControl(optCtrl=list(maxfun=1e6)))
>>>
>>> So, to answer @Robert Long: the factor D I was referring to is a random
>>> slope, with 4 levels.
>>>
>>> To answer @Ben Bolker:
>>> indeed, I don't think that my factor D falls into either of the 2 cases
>>> you mentioned,
>>> because:
>>> a) the differences between successive levels are not equal
>>> (150ms-75ms-75ms-150ms) and we don't expect an effect ordered in time;
>>> we expect the effect to be present at one or more latencies depending
>>> on L;
>>> b) the factor has more than two levels.
>>>
>>> According to all of this, I should go for a CS model, right?
>>> I'm a newbie in this field, so can you please give me some pointers
>>> on what I can read about it, or some guidance on how to
>>> handle this (especially if I want to gradually reduce the random
>>> structure of the subjects part, see modelreduced2)?
>>>
>>> modelreduced1: lmer(MEPzed ~ L * V * D + (L*V*D|subjects) +
>>> (1|items/D), data=mydata,
>>> control=lmerControl(optCtrl=list(maxfun=1e6)))
>>>
>>> modelreduced2: lmer(MEPzed ~ L * V * D + (L*V|subjects/D) +
>>> (1|items/D), data=mydata,
>>> control=lmerControl(optCtrl=list(maxfun=1e6)))
>>>
>>>
>>> Another point: is this simplification independent of which type of
>>> contrast I set for D (I'll set sum contrasts for V and L, but I'm still
>>> working out what is best for D)?
>>>
>>> Thank you in advance for this big help and please tell me if you need
>>> further clarifications or code.
>>>
>>>  Elisa Monaco | PhD student
>>> ________________________________________
>>> From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on
>>> behalf of Ben Bolker <bbolker using gmail.com>
>>> Sent: Monday, 22 July 2019 17:56
>>> To: r-sig-mixed-models using r-project.org
>>> Subject: Re: [R-sig-ME] keeping both numerically and factor coded factors
>>>
>>>   Elisa,
>>>
>>>   Can you say a little more about what your factor represents?
>>>
>>>   It probably *doesn't* make sense to collapse your factor to an
>>> integer for the purpose of allowing a diagonal covariance matrix, unless:
>>>
>>>  * it's reasonable to treat the factor levels as sequential values
>>> with equal differences between each successive pair (e.g., time), OR
>>>  * the factor only has two levels anyway
>>>
>>>   Another simplifying strategy is to use a compound-symmetric model
>>> (equal correlations among all pairs of levels): if your original model
>>> is (f|g) (where f is a factor and g is your grouping variable), then
>>> (1|g/f) will generate a CS model.
>>>
>>>   cheers
>>>     Ben Bolker
>>>
>>>
>>> On 2019-07-22 10:24 a.m., Robert Long wrote:
>>>> Dear Elisa
>>>>
>>>> Is this factor a grouping variable (for random intercepts) or a
>>>> random slope? How many levels does it have? And please can you give
>>>> us the full model formula.
>>>>
>>>>
>>>>
>>>> On Mon, 22 Jul 2019, 12:17 MONACO Elisa via R-sig-mixed-models, <
>>>> r-sig-mixed-models using r-project.org> wrote:
>>>>
>>>>> Dear list,
>>>>> looking at the correlation values of my random effects, as well as
>>>>> the fact that my model fails to converge, it makes sense to me to
>>>>> simplify its random structure (while keeping the fixed structure
>>>>> maximal and in line with our hypotheses).
>>>>> One way is to remove correlations, and I know that the || notation
>>>>> works only with numerically coded factors.
>>>>> As far as I understand, I have two options:
>>>>> 1) use the package afex, passing my model to mixed() and
>>>>> adding "expand_re = TRUE"
>>>>> 2) use the original factor, by default read as "int"
>>>>>
>>>>> I want to use option 2) because with mixed() I can't apply the
>>>>> PCA function for random effects to check whether my model is
>>>>> overparameterized.
>>>>>
>>>>> My questions are:
>>>>> a)    is it true that I can use my factor as it is when read by R,
>>>>> i.e. "int"?
>>>>> b)    if yes, does it make sense to keep in the model both the factor
>>>>> in the nominal form as fixed effect and the factor in the numerical
>>>>> form as random effect?
>>>>>
>>>>> Many thanks for your help,
>>>>>
>>>>> Elisa Monaco | PhD student
>>>>>
>>>>>         [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models using r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>