[R-sig-ME] Related fixed and random factors and planned comparisons in a 2x2 design

Mon Jun 6 21:57:00 CEST 2016

Dear Tom,

Thank you so much for these detailed replies and I appreciate your help!

Sincerely,

Paul

2016-06-06 21:51 GMT+02:00 Houslay, Tom <T.Houslay at exeter.ac.uk>:

> Hi Paul,
>
>
> I think you're right here in that actually you don't want to nest channel
> inside participant (which led to that error message - sorry, should have
> seen that coming!).
>
>
> It's hard to know without seeing data plotted, but my guess from your
> email is that you probably see some clustering both at individual level and
> at channel level? Perhaps separate random effects, ie (1|Participant) +
> (1|Channel), is the way to go (and then you shouldn't have the problem as
> regards number of observations - instead you'll have an intercept deviation
> for each of your N individuals, and also intercept deviations for each of
> your 9 channels). You certainly want to keep the participant intercept in
> though, as each individual gets both items (right?), so you need to model
> that association. You can use your variance components output from lmer to
> determine what proportion of the phenotypic variance (conditional on your
> fixed effects) is explained by each of these components, eg
> V(individual)/(V(individual) + V(channel) + V(residual)) would give you the
> proportion explained by differences among individuals in their voltage. It
> would be cool to know whether differences among individuals, or among
> channels, are driving the variation that you find. I think the sjPlot
> function for lmer would be useful for looking at the levels of your random
> effects:
>
>
> http://strengejacke.de/sjPlot/sjp.lmer/
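
A minimal sketch of the variance-proportion calculation described above, V(component) / sum of all variance components. The helper name variance_proportions and the numbers in vc_example are made up for illustration; the only assumption about lme4 is that as.data.frame(VarCorr(model)) returns one row per component with columns 'grp', 'var2' and 'vcov'.

```r
# Hypothetical helper: share of the total random-effect variance
# attributed to each component (grouping factors plus residual).
# 'vc' is assumed to look like as.data.frame(lme4::VarCorr(model)).
variance_proportions <- function(vc) {
  # keep variances only (covariance rows have a non-NA 'var2')
  if ("var2" %in% names(vc)) vc <- vc[is.na(vc$var2), ]
  setNames(vc$vcov / sum(vc$vcov), as.character(vc$grp))
}

# Made-up variance components, for illustration only
vc_example <- data.frame(
  grp  = c("participant", "channel", "Residual"),
  vcov = c(2.0, 0.5, 1.5)
)
props <- variance_proportions(vc_example)
print(round(props, 3))

# With a fitted model (requires the lme4 package):
# model <- lmer(voltage ~ group * item + (1 | participant) + (1 | channel),
#               data = data)
# variance_proportions(as.data.frame(VarCorr(model)))
```

These proportions are conditional on the fixed effects in the model, so they would change if group or item were dropped.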
>
>
> As for 'contrasts', again I haven't used that particular package, but from
> a brief glance it looks like you're on the right track - binary coding is
> the 'simple coding' as set out here:
>
>
> http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
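
As a rough illustration of the two coding schemes mentioned in this thread, the sketch below compares R's default 0/1 treatment coding with sum-to-zero coding set through contrasts(). The toy factors are invented, and which scheme suits the planned comparisons is not settled here.

```r
# Toy factors matching the 2x2 design in this thread (one row per cell)
group <- factor(c("A", "A", "B", "B"))
item  <- factor(c("P", "Q", "P", "Q"))

# R's default: treatment (0/1 dummy) coding, first level is the reference
treatment_coding <- contr.treatment(levels(group))

# Sum-to-zero (+1/-1) coding, assigned with contrasts()
contrasts(group) <- contr.sum(2)
contrasts(item)  <- contr.sum(2)
sum_coding <- contrasts(group)

print(treatment_coding)
print(sum_coding)

# Design matrix under sum coding: in a balanced 2x2 layout the intercept,
# group, item and interaction columns are mutually orthogonal
X <- model.matrix(~ group * item)
print(crossprod(X))
```

Under treatment coding the 'group' coefficient is the A-versus-B difference at the reference item only, whereas under sum coding it is half the difference averaged over both items, so the choice changes what each fixed-effect estimate means.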
>
>
> Good luck!
>
>
> Tom
>
>
> ------------------------------
> *From:* paul <graftedlife at gmail.com>
> *Sent:* 06 June 2016 20:06:02
> *To:* Houslay, Tom
> *Cc:* r-sig-mixed-models at r-project.org
> *Subject:* Re: Related fixed and random factors and planned comparisons
> in a 2x2 design
>
> Dear Tom,
>
> Many thanks for these very helpful comments and suggestions! Would you
> just allow me to ask some further questions:
>
> 1. I've been considering whether to cross or to nest the random effects
> for quite a while. Data from the same channel across participants do show
> corresponding trends (thus a bit different from the case when, e.g.,
> sampling nine neurons from the same individual). Would nesting channel
> within participant deal with that relationship?
>
> 2. I actually also tried nesting channel within participant. However, when
> I proceeded to run planned comparisons (I guess I'd better have them done
> because of their theoretical significance) based on this mixed-effect
> modeling approach (as illustrated in the earlier mail but with the random
> factor as (1|participant/channel), to maintain consistency of analytical
> methods), R gave me an error message:
>
> Error: number of levels of each grouping factor must be < number of observations
>
>
> I think this is because in my data, each participant only contributes one
> data point per channel and thus the data points are not enough. I guess
> that probably means I can't go on in this direction to run the planned
> comparisons... (?) I'm not quite sure how contrasts based on binary dummy
> variables may be done and will try to explore that further. But before I
> established the mixed model I had already set up orthogonal contrasts for
> group and item in the dataset using the function contrasts(). Does this have
> anything to do with what you meant?
>
> 3. I worried about pseudoreplication when participant ID is not
> included. On this point, it later occurred to me that pseudoreplication
> usually arises when multiple responses from the same individual are
> grouped in the same cell, making the data within that cell
> non-independent (similar to the case of repeated-measures ANOVA? Sorry if
> I have misunderstood...). But, as mentioned earlier, in my data each
> participant contributes only one data point per channel. When channel
> alone is already modeled as a random factor, would that mean that all
> data points within a cell come from different participants, and thus
> that the independence assumption is satisfied in this case? (Again, I'm
> sorry if my understanding is wrong, and I would appreciate guidance on
> this point...)
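
One way to see where the error in point 2 can come from: in lme4, (1|participant/channel) expands to (1|participant) + (1|participant:channel), so the implied grouping factor participant:channel needs fewer levels than there are observations. The sketch below counts both on a simulated layout; the sizes (4 participants, 9 channels, 2 items) and the helper name count_nested_levels are assumptions, not the real data.

```r
# Simulated layout (assumption): 4 participants x 9 channels x 2 items,
# one observation per participant-channel-item combination
d <- expand.grid(participant = factor(1:4),
                 channel     = factor(1:9),
                 item        = factor(c("P", "Q")))

# Hypothetical helper: number of levels of the implied grouping factor
# participant:channel versus the number of observations
count_nested_levels <- function(df) {
  cell <- interaction(df$participant, df$channel, drop = TRUE)
  c(levels = nlevels(cell), observations = nrow(df))
}

full   <- count_nested_levels(d)                   # both items
only_p <- count_nested_levels(d[d$item == "P", ])  # item P only

print(full)    # fewer levels than observations
print(only_p)  # as many levels as observations
```

When every participant-channel cell holds a single observation, as in the item-P subset here, the nested intercept cannot be separated from the residual, which is the situation the error message guards against.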
>
> Many, many thanks!
>
> Paul
>
>
> 2016-06-06 19:10 GMT+02:00 Houslay, Tom <T.Houslay at exeter.ac.uk>:
>
>> Hi Paul,
>>
>> I don't think anyone's responded to this yet, but my main point would be
>> that you should check out Schielzeth & Nakagawa's 2012 paper 'Nested by
>> design' (
>> http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210x.2012.00251.x/abstract
>> ) for a nice rundown on structuring your model for this type of data.
>>
>> It may also be worth thinking about how random intercepts work in a
>> visual sense; there are a variety of tools that help you do this from a
>> model (packages sjPlot, visreg, broom), or you can just plot different
>> levels yourself (eg consider plotting the means for AP, AQ, BP, BQ; the
>> same with mean values from each individual overplotted around these group
>> means; and even the group means with all points shown, perhaps coloured by
>> individual - ggplot2 is really useful for getting this type of figure
>> together quickly).
>>
>> As to some of your other questions:
>>
>> 1) You need to keep participant ID in. I'm not 100% on your data
>> structure from the question, but you certainly seem to have repeated
>> measures for individuals (I'm assuming that groups A and B each contain
>> multiple individuals, none of whom were in both groups, and each of which
>> were shown both objects P and Q, in a random order). It's not surprising
>> that the effects of group are weakened if you remove participant ID,
>> because you're then effectively entering pseudoreplication into your model
>> (ie, telling your model that all the data points within a group are
>> independent, when that isn't the case).
>>
>> 2) I think channel should be nested within individual, with a model
>> something like model <- lmer(voltage ~ group * item +
>> (1|participant/channel), data = ...)
>>
>> 3) This really depends on what your interest is. If you simply want to
>> show that there is an overall interaction effect, then your p-value from a
>> likelihood ratio test of the model with/without the interaction term gives
>> significance of this interaction, and then a plot of predicted values for
>> the fixed effects (w/ data overplotted if possible) should show the trends.
>> You could also use binary dummy variables to make more explicit contrasts,
>> but it's worth reading up on these a bit more. I don't really use this
>> type of comparison very much, so I can't comment further I'm afraid.
>>
>> 4) Your item is like treatment in this case - you appear to be more
>> interested in the effect of different items (rather than how much variation
>> 'item' explains), so keep this as a fixed effect and not as random.
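
The likelihood ratio test mentioned in point 3 could be sketched as follows: fit the model with and without the group:item interaction by maximum likelihood and compare the two fits with anova(). The data are simulated (20 participants, 9 channels, an effect built in for group B on item Q) and crossed random intercepts are used; none of this comes from the real data set.

```r
set.seed(1)
# Simulated data (assumption): 20 participants (10 per group), 9 channels, 2 items
d <- expand.grid(participant = factor(1:20),
                 channel     = factor(1:9),
                 item        = factor(c("P", "Q")))
d$group <- factor(ifelse(as.integer(d$participant) <= 10, "A", "B"))
p_eff <- rnorm(20)[as.integer(d$participant)]         # participant intercepts
c_eff <- rnorm(9, sd = 0.5)[as.integer(d$channel)]    # channel intercepts
# built-in effect: only group B differs between P and Q
d$voltage <- p_eff + c_eff +
  1 * (d$group == "B" & d$item == "Q") + rnorm(nrow(d))

lrt <- NULL
if (requireNamespace("lme4", quietly = TRUE)) {
  # ML fits (REML = FALSE) so models with different fixed effects are comparable
  m_full <- lme4::lmer(voltage ~ group * item + (1 | participant) + (1 | channel),
                       data = d, REML = FALSE)
  m_null <- lme4::lmer(voltage ~ group + item + (1 | participant) + (1 | channel),
                       data = d, REML = FALSE)
  lrt <- anova(m_null, m_full)  # chi-square test for the interaction term
  print(lrt)
}
```

The p-value in the "Pr(>Chisq)" column tests the interaction as a whole; it does not say which cells differ, which is where the planned comparisons would come in.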
>>
>> Hope some of this is useful,
>>
>> Tom
>>
>>
>> ________________________________________
>>
>>
>> Message: 1
>> Date: Fri, 3 Jun 2016 14:28:59 +0200
>> From: paul <graftedlife at gmail.com>
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Related fixed and random factors and planned
>>         comparisons     in a 2x2 design
>>
>> Dear All,
>>
>> I am trying to use mixed-effect modeling to analyze brain wave data from
>> two groups of participants when they were presented with two distinct
>> stimuli. The data points (scalp voltage) were gathered from the same set
>> of 9 nearby channels from each participant. So I have the following
>> factors:
>>
>>    - voltage: the dependent variable
>>    - group: the between-participant/within-item variable for groups A and B
>>    - item: the within-participant variable (note there are exactly 2
>>    items, P and Q)
>>    - participant: identifying each participant across the two groups
>>    - channel: identifying each channel (note that data from these channels
>>    in a nearby region tend to display similar, thus correlated, patterns in
>>    the same participant)
>>
>> The hypothesis is that only group B will show a difference between P and Q
>> (i.e., there should be an interaction effect). So I fitted a
>> mixed-effect model using the lme4 package in R:
>>
>> model <-
>> lmer(voltage~1+group+item+(group:item)+(1|participant)+(1|channel),
>>               data=data, REML=FALSE)
>>
>> Questions:
>>
>>    1. I'm not sure if it is reasonable to add in participant as a random
>>    effect, because it is related to group and seems to weaken the effects of
>>    group. Would it be all right if I don't add it in?
>>
>>    2. Because the data from nearby channels of the same participant tend to be
>>    correlated, I'm not sure if modeling participant and channel as crossed
>>    random effects is all right. But meanwhile it seems also strange if I treat
>>    channel as nested within participant, because they are the same set of
>>    channels across participants.
>>
>>    3. The interaction term is significant. But how should planned comparisons
>>    be done (e.g., differences between groups A and B for P) or is it even
>>    necessary to run planned comparisons? I saw suggestions for t-tests,
>>    lsmeans, glht, or for more complicated methods such as breaking down the
>>    model and subsetting the data:
>>
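>>    library(lme4)        # lmer()
>>    library(data.table)  # 'data' is assumed to be a data.table
>>    data[, P_True:=(item=="P")]
>>    # fit the group effect on the item-P rows only
>>    posthoc<-lmer(voltage~1+group
>>        +(1|participant)+(1|channel)
>>        , data=data[P_True==TRUE]
>>        , REML=FALSE)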
>>
>>    But especially here comparing only between two groups while modeling
>>    participant as a random effect seems detrimental to the group effects. And
>>    I'm not sure if it is really OK to do so. On the other hand, because the
>>    data still contain non-independent data points (from nearby channels), I'm
>>    not sure if simply using t-tests is all right. Will non-parametric tests
>>    (e.g., Wilcoxon tests) do in such cases?
>>
>>    4. I suppose I don't need to model item as a random effect because there
>>    are only two of them, one for each level, right?
>>
>> I would really appreciate your help!!
>>
>> Best regards,
>>
>> Paul