[R-sig-ME] Fw: Question about inclusion of a random effect

Wed Aug 9 11:29:11 CEST 2017

Let me address your first question:

> If I remove (1|Question) there is no convergence warning.  Is this an
> indication that the variance of this random effect is 0 thereby creating
> a problem with the optimizer?  Does this warrant removing this random
> effect?  If not, any suggestions on how to proceed with the convergence
> issues?

No, a convergence warning does not in general indicate that the variance
of that random effect is zero. A variance of exactly zero is something
the optimiser can often find on its own. You can see this by replacing
the different subjects in the lme4 sleepstudy example with copies of a
single subject:

> library(lme4)
> # stack 18 copies of subject 308's ten observations, one copy per Subject level
> n <- length(levels(sleepstudy$Subject))
> novar <- do.call(rbind, replicate(n, subset(sleepstudy, Subject == 308), simplify=FALSE))
> novar$Subject <- sleepstudy$Subject
> summary(lmer(Reaction ~ 1 + Days + (1|Subject), novar))

Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept)    0      0.00
 Residual             1847     42.97
Number of obs: 180, groups:  Subject, 18

(That said, there may be valid reasons to drop the additional random
effect, but that's based on particulars of your dataset and
predictive/inferential goals, so only you can make that determination.)
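
If you want to check for a zero variance estimate without reading the
summary by eye, something along these lines should work. It is only a
sketch: it rebuilds the same artificial data as above, and it assumes a
version of lme4 recent enough to provide isSingular().

```r
library(lme4)

# Same artificial data as above: every Subject level gets subject 308's data.
n <- nlevels(sleepstudy$Subject)
novar <- do.call(rbind,
                 replicate(n, subset(sleepstudy, Subject == 308),
                           simplify = FALSE))
novar$Subject <- sleepstudy$Subject

fit <- lmer(Reaction ~ 1 + Days + (1 | Subject), novar)

# Variance components; the Subject standard deviation should be (near) zero.
print(VarCorr(fit))

# TRUE when a variance parameter sits on the boundary of its allowed range.
singular <- isSingular(fit)
print(singular)

# theta is the Subject SD relative to the residual SD.
theta <- getME(fit, "theta")
print(theta)
```

A singular fit and a convergence warning are reported separately, so one
can occur without the other.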

The usual first step in working on model convergence is scaling your
variables, but that's not an option with categorical variables. You
state that your variables have no trouble with collinearity, but are you
sure? Unfortunately, things like Sex and <X-Scientist> (for various X)
often do correlate.
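
One quick way to look at this for two categorical predictors is a
cross-tabulation plus a measure of association such as Cramer's V. The
sketch below uses simulated respondent-level data (one row per
respondent, not per response); the data frame `resp` and the built-in
dependence between Sex and Biologist are made up purely for
illustration.

```r
# Hypothetical respondent-level data; column names follow the model in the thread.
set.seed(1)
n <- 200
resp <- data.frame(Sex = factor(sample(c("Male", "Female"), n, replace = TRUE)))

# Make Biologist depend on Sex on purpose, to mimic the kind of correlation
# described above.
p_bio <- ifelse(resp$Sex == "Male", 0.7, 0.3)
resp$Biologist <- factor(ifelse(runif(n) < p_bio, "Yes", "No"))

# Cross-tabulation: strongly unbalanced cells hint at an association.
tab <- table(resp$Sex, resp$Biologist)
print(tab)

# Cramer's V: 0 means no association, 1 means perfect association.
chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
cramers_v <- sqrt(as.numeric(chi2) / (sum(tab) * (min(dim(tab)) - 1)))
print(cramers_v)
```

With real data you would build the table from one row per respondent,
since repeating each respondent 94 times would overstate the evidence in
any test.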

The next step is to try out different optimisers and increase the number
of iterations. There is no free lunch and there is no optimiser that is
always best. See

> ?convergence

after loading lme4.
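
As a rough sketch of what that looks like in practice, here is the same
binomial GLMM fitted with two of lme4's built-in optimisers and a raised
evaluation limit. The cbpp data set (shipped with lme4) is only a
stand-in for your data, and the value of maxfun is an arbitrary choice.

```r
library(lme4)

# cbpp ships with lme4; it is a stand-in here, not the survey data from the thread.
m_bobyqa <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                  data = cbpp, family = binomial,
                  control = glmerControl(optimizer = "bobyqa",
                                         optCtrl = list(maxfun = 2e5)))

# Same model, different optimiser.
m_nm <- update(m_bobyqa,
               control = glmerControl(optimizer = "Nelder_Mead",
                                      optCtrl = list(maxfun = 2e5)))

# Compare the estimates and log-likelihoods side by side.
print(cbind(bobyqa = fixef(m_bobyqa), Nelder_Mead = fixef(m_nm)))
print(c(bobyqa = as.numeric(logLik(m_bobyqa)),
        Nelder_Mead = as.numeric(logLik(m_nm))))
```

If the optimisers agree to a few decimal places, the warning is more
likely to be a false alarm. Recent versions of lme4 also provide
allFit(), which refits a model with all available optimisers in one go.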

You can also try the faster but less exact estimation that glmer uses
with nAGQ=0 (the default, nAGQ=1, is the Laplace approximation). It is
worse but potentially good enough. There has been some discussion on
the list lately about this option.
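
A minimal sketch of the nAGQ=0 comparison, again using cbpp from lme4 as
a stand-in for your data: fit once with the default (nAGQ=1) and once
with nAGQ=0, then look at how far the fixed effects move.

```r
library(lme4)

# Default fit: nAGQ = 1.
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# Faster, less exact fit: fixed effects are estimated inside the
# penalized iteratively reweighted least squares step.
gm0 <- update(gm1, nAGQ = 0)

# Side-by-side fixed-effect estimates.
print(cbind(nAGQ0 = fixef(gm0), nAGQ1 = fixef(gm1)))
```

If the two sets of estimates are close relative to their standard
errors, nAGQ=0 may be good enough for your purposes.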

Finally, if all that fails, you can try out other mixed-models packages
in R. Both MCMCglmm and brms are excellent, if you're willing to go
Bayesian. (And if you're going Bayesian, then weakly informative
regularizing priors may help a lot with convergence!)

Phillip

On 08/08/2017 08:57 PM, Chad Newbolt wrote:
> 
> 
> So might help out to give more specifics...here is my model with explanations of effects and associated levels
> 
> results=glmer(Status2~Group+Distance+Type+Biologist+Sex+Experience+(1|ID)+(1|Question),data=datum,family=binomial)
> 
> Status2 = Binomial response where 0 is incorrect response and 1 is correct response
> Distance (Outside, Inside) = Image contains an animal  inside or outside a specified distance
> Type (Night, Day) = Image is taken during day or night
> Group (Male, Female, Juvenille) = Image contains an animal that is male, female or juvenille
> Biologist (Yes, No) = Respondent is/is not a Biologist
> Sex (Male Female) = Sex of respondent
> Experience (High, moderate, low, none) = experience looking at images of species of animal in images
> 
> 
> As you can see my fixed effects can be broken down into two broad categories 1) those that categorize the image, 2) those that categorize the respondent...both of which may influence their ability to correctly answer questions.  I made sure during study planning that I have roughly equal numbers of each possible "category" of image represented in the survey.  These were chosen at random from larger pools of each possible image category.  In light of the previous response, since I have fixed effects that categorize the images, or questions, would it still make sense to include (1|Question) or create (1|Category) with n levels to account for variation not associated with my fixed effects?
> 
> For reference, I evaluated VIF of the fixed effects and found little evidence of multicollinearity, and I'm interested in the effects of each of these so I would prefer to keep them in the model in this case.
> 
> 
> Chad Newbolt
> 
> Research Associate
> 
> School of Forestry And Wildlife Sciences
> 
> Auburn University
> 
> 334-332-4864
> 
> ________________________________________
> From: Ewart A C Thomas <ethomas at stanford.edu>
> Sent: Tuesday, August 8, 2017 1:19 PM
> To: Chad Newbolt
> Subject: Re: [R-sig-ME] Question about inclusion of a random effect
> 
> chad, lmer() and its optimisation is a computationally complex undertaking, and one shd always keep the ’size’ of the model in mind.
> 
> you have 94 items.  might you reduce this to ‘categories’ of items (e.g., ‘faces’, ‘people’, ‘houses’, …), such that you have a much smaller number (e.g., 10) of categories.  you wd replace each respondent’s string of 94 responses by a string of 10 categ-responses, and the random effect term wd be (1 | category).  does this make theoretical sense, given the nature of your material?
> 
> also, including x1 thru x6 feels a little like a shopping expedition.  maybe some exploratory anal (factor anal?) wd suggest either (i) using 2 or 3 composites based on x1-x6, or (ii) omitting about 3 of the x’s, because they don’t explain anything.
> 
> you didn’t raise this possibility, but it cd be that some questions/categories are more ‘sensitive’ to x1 than other questions.  in this case, you might try to fit a model with (1 + x1 | category), and see if it fits sig better than the ‘intercept only’ model with (1 | category) - using anova(model1, model2).  good luck!
> ewart
> 
>> On Aug 8, 2017, at 11:06 AM, Chad Newbolt <newboch at auburn.edu> wrote:
>>
>> When I include  (1|Question) I receive the dreaded convergence warning...
>>
>> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>>  Model failed to converge with max|grad| = 0.00303355 (tol = 0.001, component 1)
>>
>> If I remove (1|Question) there is no convergence warning.  Is this an indication that the variance of this random effect is 0 thereby creating a problem with the optimizer?  Does this warrant removing this random effect?  If not, any suggestions on how to proceed with the convergence issues?
>>
>>
>>
>> Chad Newbolt
>>
>> Research Associate
>>
>> School of Forestry And Wildlife Sciences
>>
>> Auburn University
>>
>> 334-332-4864
>>
>> ________________________________________
>> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Chad Newbolt <newboch at auburn.edu>
>> Sent: Tuesday, August 8, 2017 12:50 PM
>> To: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Question about inclusion of a random effect
>>
>> Thanks to everyone for the clarification and quick responses!!!
>>
>> ________________________________
>> From: Alday, Phillip <Phillip.Alday at mpi.nl>
>> Sent: Tuesday, August 8, 2017 12:43 PM
>> To: Chad Newbolt; r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Question about inclusion of a random effect
>>
>>
>> Yes, it makes sense. This is what is often called an "item" in the discussion on crossed random effects and leaving it out can distort inferences - see Clark 1973 "The language-as-fixed-effect fallacy" and more recent work by  Westfall and Judd (I'm thinking of their 2012 paper on this, but I can't think of the title or author order and I'm not at my desk to look it up).
>>
>> Phillip
>> ________________________________
>> From: Chad Newbolt <newboch at auburn.edu>
>> Sent: Aug 8, 2017 7:25 PM
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Question about inclusion of a random effect
>>
>>
>> All,
>>
>>
>>
>> I'm working on analyzing a data set from a survey.  In the survey, I asked a group of respondents to view a series of 94 images, or test questions, and I'm in process of evaluating the influence of various factors on their ability to correctly identify an item in an image.  The test questions likely show a considerable amount of variation in difficulty, with some being harder to correctly answer than others.  I understand that I clearly should include a random effect for each respondent (ID), however, I'm not sure if it is appropriate to include a random effect for question (1|Question) to account for variation.  I may be overthinking this one, but, including and removing (1|Question) dramatically changes my results so I want to make sure to get this one right.
>>
>>
>>
>> My basic model is shown below for reference:
>>
>>
>>
>>  results=glmer(Y~X1+X2+X3+X4+X5+X6+(1|ID)+(1|Question),data=datum,na.action = na.omit,family=binomial)
>>
>>
>>
>> Thanks in advance for the help
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models