[R-sig-ME] linear mixed model with non-monotonic longitudinal data
Brad Buran
bburan at galenea.com
Thu Jan 3 16:37:35 CET 2013
Hi Ben et al.:
Thanks for the great suggestions. I tried the models you suggested (e.g. ~genotype*time+(1|subject_id)). I think it's clear that the dataset I'm trying to fit is not well-described by any of these models. At least one of the models I tested is probably very similar to how SPSS implements it, so it's probably a safe bet that I'm using the wrong approach.
We set genotype as a fixed effect because it appears (based on some literature I read) that it is the appropriate approach (i.e. we are interested in comparing the two genotypes tested).
Thanks again for all your help,
Brad
________________________________________
From: Ben Bolker [bbolker at gmail.com]
Sent: Friday, December 28, 2012 12:22 PM
To: Brad Buran
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] linear mixed model with non-monotonic longitudinal data
On 12-12-27 05:31 PM, Brad Buran wrote:
> That was very helpful. It didn't occur to me that it was possible
> to use functions such as poly in the equation. That said, none of
> the models (neither the ones I mentioned nor the ones you
> suggested) seems to be a good fit for the data based on the
> residuals. Since the original model was defined in SPSS (using the
> genlinmixed function), I want to try to determine the actual model
> that SPSS uses (they make you step through a GUI to define your
> model rather than defining an equation like R does). I haven't been
> able to find any documentation on how the inputs to the genlinmixed
> command are transformed into a model that I could try to create in R
> (so I can check its validity).
> Is anyone aware of how the SPSS GENLINMIXED model with
> SUBJECTS=subject_id, REPEATED_MEASURES=time, FIXED_EFFECTS=genotype
> time genotype*time would translate to an R formula?
I *think* this would be something like
~ genotype+time+genotype:time + (1|subject_id)
or equivalently
~ genotype*time + (1|subject_id)
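As a quick check of the equivalence above, here is a minimal base-R sketch that compares the fixed-effect terms the two spellings expand to. The response name `power` is taken from the original post; the random-effect term is left out on purpose, because base R's `terms()` has no notion of lme4's `(1|subject_id)` syntax.

```r
# Fixed-effect part only; the (1|subject_id) term is lme4 syntax and is
# deliberately omitted from this base-R check.
f_long  <- power ~ genotype + time + genotype:time
f_short <- power ~ genotype * time

# terms() expands the formula operators into individual model terms.
labels_long  <- attr(terms(f_long),  "term.labels")
labels_short <- attr(terms(f_short), "term.labels")

same_terms <- identical(labels_long, labels_short)
print(labels_short)
print(same_terms)
```

Both formulas expand to the two main effects plus the `genotype:time` interaction, so the choice between them is purely one of notation.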
I assume that (1) there are multiple subjects per genotype
(otherwise subject_id and genotype would be confounded) and
(2) perhaps only one sample per subject -- otherwise it would make
more sense to use (time|subject_id) to allow a time-by-subject
interaction.
In the R specification it seems one doesn't need both
SUBJECTS and REPEATED_MEASURES (although maybe SPSS does
something else with the REPEATED_MEASURES specification, such
as allowing an R-side autoregressive model to be specified?)
Or else I'm missing something (quite likely).
I'm also a little surprised that genotype is a fixed effect,
unless the sample size is small (or SPSS doesn't allow multiple,
nested random effects ...) I would think a model like
~ time + (time|genotype/subject_id)
would be best in general?
Re: time signal -- did you try a GAM? It's a little hard to
see how that could fail to fit a reasonably smooth signal, unless
the shape was really weird ... I can appreciate that quadratics
wouldn't do the job. There are more exotic options like the Ricker
function, power = a*time*exp(-b*time), which can be fitted if you are
allowed a logarithmic link and an offset:
log(power) ~ offset(log(time)) + time
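To spell out why the offset works: taking logs of power = a*time*exp(-b*time) gives log(power) = log(a) + log(time) - b*time, so log(time) enters with a coefficient fixed at 1 (the offset), the intercept estimates log(a), and the slope on time estimates -b. The sketch below checks this with base R on simulated, noise-free data. The values of `a` and `b` and the time grid are made up for illustration and are not from the thread, and there is no subject-level random effect here.

```r
# Hypothetical noise-free Ricker curve: power = a * time * exp(-b * time).
# a_true and b_true are illustrative values only.
a_true <- 2.5
b_true <- 0.01
time   <- seq(5, 1000, by = 5)   # 5 ms steps; starts above 0 so log(time) is finite
power  <- a_true * time * exp(-b_true * time)

# log(power) = log(a) + 1 * log(time) - b * time
# offset() pins the coefficient of log(time) at exactly 1.
fit <- lm(log(power) ~ offset(log(time)) + time)

a_hat <- exp(coef(fit)[["(Intercept)"]])  # intercept is log(a)
b_hat <- -coef(fit)[["time"]]             # slope on time is -b

print(c(a_hat = a_hat, b_hat = b_hat))
```

This version log-transforms the response, which assumes multiplicative error and requires strictly positive power values. A log link, for example `glm(power ~ offset(log(time)) + time, family = gaussian(link = "log"))`, keeps the response on its original scale and so makes a different assumption about the errors.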
> Thanks!
> Brad
> ________________________________________
> From: r-sig-mixed-models-bounces at r-project.org [r-sig-mixed-models-bounces at r-project.org] on behalf of Ben Bolker [bbolker at gmail.com]
> Sent: Thursday, December 27, 2012 10:49 AM
> To: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] linear mixed model with non-monotonic longitudinal data
>
> Brad Buran <bburan at ...> writes:
>
>> I'm attempting to fit a linear mixed model to my dataset. This
>> data is the measure of stimulus-evoked power as a function of time.
>> We have 32 subjects from two populations (broken down by genotype).
>> The stimulus-evoked power is sampled at a high rate (one data-point
>> every 5 msec) and reflects the "longitudinal" or "within-subjects"
>> measure in my study.
>
>> Right now I've defined the model as:
>>
>> power ~ genotype * time + (1|subject_id)
>>
>> I understand that one must also test additional models such as:
>>
>> power ~ genotype * time + (time|subject_id)
>> power ~ genotype * time + (1|subject_id) + (0+time|subject_id)
>
>> However, power is not a linear function of time (i.e. it is
>> non-monotonic). Power rapidly increases over a few hundred
>> milliseconds to a peak value then gradually declines afterwards. In
>> this situation, would it be inappropriate to use time for
>> determining a slope for the random effect?
>
>> I'm actually not even sure whether a linear mixed model is
>> appropriate for this type of data (considering the power response is
>> non-monotonic with respect to time). However, this is how the
>> original analysis was set up by a predecessor and I am currently
>> trying to determine the validity of this approach. Thanks, Brad
>
> Hard to answer completely in general. The simplest approach
> would probably be to make the response a quadratic function of
> time; there are a few slightly complicating issues (whether to
> use a boneheaded approach such as (genotype * (time + I(time^2))) or
> to use poly(time,2) , which constructs orthogonal polynomials
> by default, and how to get the time*subject interactions specified
> correctly), but it's pretty easy and if it looks like it fits
> your data well I might be satisfied with it.
> You could also fit generalized additive mixed
> models (see the mgcv and gamm4 packages), again I'm not 100%
> sure how to incorporate the time*subject interactions.
>
> The bottom line is that linear models are actually pretty
> flexible for modeling continuous, not necessarily linear,
> responses (the assumption is that the model is a linear function
> of the parameters, not necessarily that (e.g.) power is
> a linear function of time).
>
> Ben Bolker
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>