[R-sig-ME] Specifying models nested crossed random effects
Joshua Rosenberg
jmichaelrosenberg at gmail.com
Tue Apr 25 16:28:08 CEST 2017
Evan - thank you very much for your advice. I've basically specified the
model as you suggested, and it seems to be a reasonable approach.
Thanks again,
Josh
On Sun, Apr 9, 2017 at 2:56 PM, Evan Palmer-Young <ecp52 at cornell.edu> wrote:
> Thanks for those details, Josh. Interesting design!
>
> I'm not experienced in interpreting random effects on their own, so others
> will have better advice on that.
>
> For your model structure, it sounds like there are three random effects:
>
> "program_ID"
> "participant_ID"
> "sample_ID"
>
> From my reading of lme4 documentation, I think that you have coded
> sample_ID correctly and do not need to explicitly nest it within program_ID.
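A minimal sketch of why explicit nesting is unnecessary here, assuming sample_ID labels are never reused across programs. The toy data frame and its values are made up for illustration. In lme4, (1 | program_ID/sample_ID) expands to (1 | program_ID) + (1 | program_ID:sample_ID), so the question is only whether program_ID:sample_ID groups the rows differently from sample_ID alone:

```r
# Toy data (hypothetical values): 2 programs, 3 samples each,
# with sample_ID labels that are unique across programs.
df <- data.frame(
  program_ID = rep(c("p1", "p2"), each = 6),
  sample_ID  = rep(c("p1_s1", "p1_s2", "p1_s3",
                     "p2_s1", "p2_s2", "p2_s3"), each = 2)
)

# The grouping factor that explicit nesting would use
nested <- interaction(df$program_ID, df$sample_ID, drop = TRUE)

# Number of distinct programs each sample_ID appears in
samples_per_program <- tapply(df$program_ID, df$sample_ID,
                              function(x) length(unique(x)))

# The nested factor can only split sample_ID groups further, never merge
# them, so an equal number of levels means the two groupings are identical.
same_grouping <- nlevels(nested) == nlevels(factor(df$sample_ID))
print(same_grouping)
```

If the labels were instead reused across programs (for example "s1", "s2" in every program), the two groupings would differ and the explicit program_ID:sample_ID term would be needed.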
>
> In general, I think it may be better form to include both fixed and random
> predictors in your model, rather than having separate models to assess only
> the random effects.
>
> So your model might be something like,
>
> interest_model <- lmer(interest ~ Instruction_type + time_of_day +
> Working_alone + (1|program_ID) + (1|participant_ID) + (1|sample_ID),
> data = df)
>
> Where Instruction_type, time_of_day, and Working_alone are fabricated
> variables that might resemble variables you recorded.
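A runnable sketch of this kind of model on simulated data, for anyone who wants to try the syntax. Everything in it is an assumption for illustration: the data are simulated, the effect sizes are made up, and only one of the fabricated predictors (Working_alone) is included to keep it short. The layout mimics the design described in this thread (10 programs, about 200 participants, about 200 samples, with every participant in a program answering the same samples). The model is only fitted if lme4 is installed.

```r
set.seed(1)

# Simulated stand-in data; all names and effect sizes are made up.
n_prog <- 10   # programs
n_part <- 20   # participants per program
n_samp <- 20   # samples (signals) per program
df <- expand.grid(part = seq_len(n_part),
                  samp = seq_len(n_samp),
                  prog = seq_len(n_prog))
df$program_ID     <- factor(df$prog)
df$participant_ID <- factor(paste(df$prog, df$part, sep = "_"))
df$sample_ID      <- factor(paste(df$prog, df$samp, sep = "_"))

# Situation-level predictor: constant within a sample_ID
alone_by_sample  <- rbinom(nlevels(df$sample_ID), 1, 0.5)
df$Working_alone <- alone_by_sample[as.integer(df$sample_ID)]

# Outcome = fixed effect + one random intercept per grouping factor + noise
df$interest <- 2.5 + 0.3 * df$Working_alone +
  rnorm(n_prog, sd = 0.2)[as.integer(df$program_ID)] +
  rnorm(nlevels(df$participant_ID), sd = 0.5)[as.integer(df$participant_ID)] +
  rnorm(nlevels(df$sample_ID), sd = 0.3)[as.integer(df$sample_ID)] +
  rnorm(nrow(df), sd = 0.6)

if (requireNamespace("lme4", quietly = TRUE)) {
  interest_model <- lme4::lmer(
    interest ~ Working_alone +
      (1 | program_ID) + (1 | participant_ID) + (1 | sample_ID),
    data = df
  )
  print(summary(interest_model))
}
```

Participants and samples are crossed within a program here, which is why they enter as separate random intercepts rather than one nested inside the other.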
>
> As a disclaimer, this is my second time answering to the list-- welcome!
>
> Best wishes, Evan
>
>
>
>
>
> On Sat, Apr 8, 2017 at 4:26 PM, Joshua Rosenberg <
> jmichaelrosenberg at gmail.com> wrote:
>
>> Thank you Evan for your response and thank you for clarifying.
>>
>> Responses are in-line below.
>>
>>
>> Thank you for considering this!
>>
>> Josh
>>
>>
>> On Sat, Apr 8, 2017 at 3:28 PM, Evan Palmer-Young <ecp52 at cornell.edu>
>> wrote:
>>
>>> Josh,
>>> Thanks for the questions.
>>> Can you provide a little bit more description about the variables?
>>>
>>
>> First, sorry, I had changed some of the variable names in the data and
>> realize I used different names (and a different outcome) in the examples at
>> the bottom.
>>
>> "interest" (one outcome we're measuring) is a variable of participants'
>> self-reported interest using a 1-4 scale.
>>
>> "overall_engagement" is one other (different) outcome: One that was a
>> composite of variables of students' interest, how hard they were
>> concentrating,
>> and how challenging they reported what they were learning was.
>>
>> We asked participants (youth) about how interested they were in what they
>> were learning at random intervals using what is called an experience
>> sampling method. In our method, youth had phones on which they were asked
>> about what they were thinking / feeling - every youth in the same program
>> (more on the programs in just a moment) was notified to answer our
>> questions at the same time, although both the instance in time and the
>> interval between these questions were different between programs.
>>
>> "site" = "program" (ID) and program is an indicator for membership in one
>> of the 10 programs.
>>
>> Because youth were repeatedly sampled, "participant_ID" is an indicator
>> for one of about 200 participants.
>>
>> "sample_ID" is an indicator unique for each program (it was made from the
>> program_ID, the date, and which of one of four samples it was for that
>> date). There are about 20 unique values for it for each program, for
>> around 200 values total.
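A small sketch of how an identifier like this can be built, assuming one row per signal. The column names (date, signal) and all values below are hypothetical, not the actual data:

```r
# Hypothetical signal schedule: column names and values are assumptions.
signals <- data.frame(
  program_ID = c("A", "A", "A", "B", "B"),
  date       = as.Date(c("2016-07-11", "2016-07-11", "2016-07-12",
                         "2016-07-11", "2016-07-11")),
  signal     = c(1, 2, 1, 1, 2)   # which of the (up to four) daily samples
)

# Pasting program, date, and signal number gives a label that is
# unique across programs, not just within a program.
signals$sample_ID <- factor(
  paste(signals$program_ID, signals$date, signals$signal, sep = "_")
)
print(signals)
```

Because the program is part of the label, two programs signalled at the same date and slot still get different sample_ID values.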
>>
>>
>>> Does "site" = "program"?
>>> Are participants queried at multiple timepoints? If pre- and
>>> post-program, could this be included as a factor with levels "before" and
>>> "after"?
>>>
>>
>> Yes, the sampling consisted of repeated measures within participant
>> (around 15-20 responses per participant). It's a bit tricky for me to
>> describe, but as I mentioned above every youth in the same program was
>> notified to answer questions at the same time, though both the instance in
>> time and the interval between these questions differed between the 10
>> programs.
>>
>>
>>> Do you have any particular hypotheses or questions you want to answer
>>> with your model?
>>>
>>
>> We're interested in, for lack of a better word, time point or
>> situation-specific ("sample_ID") variables' relationships with engagement.
>> We coded video of the programs, including before and when youth were
>> notified to respond, for example, the type of activity youth were
>> participating in (i.e., working in groups or individually; doing hands-on
>> activities or listening to the activity leaders). We imagine considering
>> these as categorical variables.
>>
>> Similarly, we're interested in relationships between youth's
>> characteristics (such as pre-program interest and demographic
>> characteristics, such as gender) and our outcomes and to a bit of a lesser
>> extent relationships between some program factors and outcomes (though with
>> only 10 programs, we do not imagine we will have statistical power to
>> detect any / many effects at that level).
>>
>> We're interested in sources of variance as a substantive question (how
>> much of students' engagement is explained by time-point ("sample_ID"),
>> youth ("participant_ID"), and program ("program_ID") effects?). Though this
>> is a bit secondary to our questions about the specific variables at
>> time-point, youth, and program levels.
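One common way to put numbers on this is to divide each estimated variance component by the total. A minimal sketch of that arithmetic follows; the helper name variance_shares and the variance values are made up, and the commented-out lines assume a data frame df with the columns named in this thread.

```r
# Proportion of total variance attributed to each grouping factor,
# given a data frame with columns `grp` and `vcov`, such as the one
# as.data.frame(lme4::VarCorr(fit)) returns for random-intercept models.
variance_shares <- function(vc) {
  setNames(vc$vcov / sum(vc$vcov), vc$grp)
}

# Hand-made variance components (made-up numbers) to show the arithmetic
vc_example <- data.frame(
  grp  = c("sample_ID", "participant_ID", "program_ID", "Residual"),
  vcov = c(0.10, 0.30, 0.05, 0.55)
)
shares <- variance_shares(vc_example)
print(round(shares, 2))

# With a fitted intercept-only model, assuming `df` exists:
# null_model <- lme4::lmer(interest ~ 1 + (1 | program_ID) +
#                            (1 | participant_ID) + (1 | sample_ID), data = df)
# variance_shares(as.data.frame(lme4::VarCorr(null_model)))
```

Note that these shares change once fixed predictors are added, since the predictors absorb part of the variance at their own level.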
>>
>>
>>> Best wishes, Evan
>>>
>>
>>
>>
>>
>> --
>> Joshua Rosenberg
>> jmichaelrosenberg at gmail.com
>> http://joshuamrosenberg.com
>>
>
>
>
> --
> Evan Palmer-Young
> PhD candidate
> Department of Biology
> 221 Morrill Science Center
> 611 North Pleasant St
> Amherst MA 01003
> https://sites.google.com/a/cornell.edu/evan-palmer-young/
> epalmery at cns.umass.edu
> ecp52 at cornell.edu
>
--
Joshua Rosenberg
jmichaelrosenberg at gmail.com
http://joshuamrosenberg.com