[R-sig-ME] linear mixed model with non-monotonic longitudinal data

Thu Jan 3 16:37:35 CET 2013

Hi Ben et al.:

Thanks for the great suggestions.  I tried the models you suggested (e.g. ~genotype*time+(1|subject_id)).  I think it's clear that the dataset I'm trying to fit is not well described by any of these models.  At least one of the models I tested is probably very similar to what SPSS fits, so it's probably a safe bet that I'm using the wrong approach.

We set genotype as a fixed effect because it appears (based on some literature I read) that it is the appropriate approach (i.e. we are interested in comparing the two genotypes tested).

Thanks again for all your help,
Brad

________________________________________
From: Ben Bolker [bbolker at gmail.com]
Sent: Friday, December 28, 2012 12:22 PM
To: Brad Buran
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] linear mixed model with non-monotonic longitudinal data

On 12-12-27 05:31 PM, Brad Buran wrote:

> That was very helpful.  It didn't occur to me that it was possible
> to use functions such as poly in the equation.  That said, it seems
> that none of the models (i.e. neither the ones I mentioned nor the
> ones you suggested) is a good fit for the data based on the
> residuals.  Since the original model was defined in SPSS (using the
> genlinmixed function), I want to try to determine the actual model
> that SPSS uses (they make you step through a GUI to define your
> model rather than defining an equation like R does).  I haven't been
> able to find any documentation on how the inputs to the genlinmixed
> command are transformed into a model that I could try to create in R
> (so I can check its validity).

> Is anyone aware of how the SPSS GENLINMIXED model with
> SUBJECTS=subject_id, REPEATED_MEASURES=time, FIXED_EFFECTS=genotype
> time genotype*time would translate to an R formula?

   I *think* this would be something like

 ~ genotype+time+genotype:time + (1|subject_id)

or equivalently

 ~ genotype*time + (1|subject_id)
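
  A quick way to check that the two fixed-effects specifications above are the same is to compare their expanded term labels. This is a minimal sketch in base R; the random-effects term is left out because only the fixed part is spelled differently, and the object names (f_long, f_star) are made up for illustration.

```r
# Fixed-effects part of each specification (random term omitted here).
f_long <- power ~ genotype + time + genotype:time
f_star <- power ~ genotype * time

# terms() expands the * operator into main effects plus the interaction.
labels_long <- attr(terms(f_long), "term.labels")
labels_star <- attr(terms(f_star), "term.labels")

print(labels_star)
print(identical(labels_long, labels_star))
```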

  I assume that (1) there are multiple subjects per genotype
(otherwise subject_id and genotype would be confounded) and
(2) perhaps only one sample per subject -- otherwise it would make
more sense to use (time|subject_id) to allow a time-by-subject
interaction.

  In the R specification it seems one doesn't need both
SUBJECTS and REPEATED_MEASURES (although maybe SPSS does
something else with the REPEATED_MEASURES specification, such
as allowing an R-side autoregressive model to be specified?)
Or else I'm missing something (quite likely).

  I'm also a little surprised that genotype is a fixed effect,
unless the sample size is small (or SPSS doesn't allow multiple,
nested random effects ...)  I would think a model like

 ~ time + (time|genotype/subject_id)

would be best in general?

  Re: time signal -- did you try a GAM? It's a little hard to
see how that could fail to fit a reasonably smooth signal, unless
the shape was really weird ... I can appreciate that quadratics
wouldn't do the job. There are more exotic options like the Ricker
function, power = a*time*exp(-b*time), which can be fitted if you
are allowed a logarithmic link and an offset:

  log(power) ~ offset(log(time)) + time
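
  To spell out the Ricker suggestion: taking logs of power = a*time*exp(-b*time) gives log(power) = log(a) + log(time) - b*time, so with log(time) as an offset the intercept estimates log(a) and the time coefficient estimates -b. Below is a minimal sketch on simulated data. The values of a and b, the noise level, and the use of lm() on log(power) (rather than a log-link GLM or a mixed model) are assumptions for illustration, not something taken from the real dataset.

```r
set.seed(1)

# Hypothetical Ricker parameters and a 5 ms sampling grid (time must be > 0
# because log(time) is used as an offset).
a <- 2
b <- 0.01
time <- seq(5, 500, by = 5)

# Simulated response with multiplicative (log-normal) noise.
power <- a * time * exp(-b * time) * exp(rnorm(length(time), sd = 0.05))

# log(power) = log(a) + log(time) - b * time
fit <- lm(log(power) ~ offset(log(time)) + time)

a_hat <- exp(coef(fit)[["(Intercept)"]])  # back-transform the intercept
b_hat <- -coef(fit)[["time"]]             # slope on time is -b

print(c(a_hat = a_hat, b_hat = b_hat))
```

  The fitted curve peaks at time = 1/b_hat, which is one way to check whether the shape is plausible for the data.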

> Thanks!
> Brad
> ________________________________________
> From: r-sig-mixed-models-bounces at r-project.org [r-sig-mixed-models-bounces at r-project.org] on behalf of Ben Bolker [bbolker at gmail.com]
> Sent: Thursday, December 27, 2012 10:49 AM
> To: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] linear mixed model with non-monotonic longitudinal data
>
> Brad Buran <bburan at ...> writes:
>
>>  I'm attempting to fit a linear mixed model to my dataset.  This
>> data is the measure of stimulus-evoked power as a function of time.
>> We have 32 subjects from two populations (broken down by genotype).
>> The stimulus-evoked power is sampled at a high rate (one data-point
>> every 5 msec) and reflects the "longitudinal" or "within-subjects"
>> measure in my study.
>
>> Right now I've defined the model as:
>>
>> power ~ genotype * time + (1|subject_id)
>>
>> I understand that one must also test additional models such as:
>>
>> power ~ genotype * time + (time|subject_id)
>> power ~ genotype * time + (1|subject_id) + (0+time|subject_id)
>
>> However, power is not a linear function of time (i.e. it is
>> non-monotonic).  Power rapidly increases over a few hundred
>> milliseconds to a peak value then gradually declines afterwards.  In
>> this situation, would it be inappropriate to use time for
>> determining a slope for the random effect?
>
>> I'm actually not even sure whether a linear mixed model is
>> appropriate for this type of data (considering the power response is
>> non-monotonic with respect to time).  However, this is how the
>> original analysis was set up by a predecessor and I am currently
>> trying to determine the validity of this approach.  Thanks, Brad
>
>   Hard to answer completely in general.  The simplest approach
> would probably be to make the response a quadratic function of
> time; there are a few slightly complicating issues (whether to
> use a boneheaded approach such as (genotype * (time + I(time^2))) or
> to use poly(time,2), which constructs orthogonal polynomials
> by default, and how to get the time*subject interactions specified
> correctly), but it's pretty easy and if it looks like it fits
> your data well I might be satisfied with it.
>   You could also fit generalized additive mixed
> models (see the mgcv and gamm4 packages), although again I'm not 100%
> sure how to incorporate the time*subject interactions.
>
>   The bottom line is that linear models are actually pretty
> flexible for modeling continuous, not necessarily linear,
> responses (the assumption is that the model is a linear function
> of the parameters, not necessarily that (e.g.) power is
> a linear function of time).
>
>   Ben Bolker
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
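
The point quoted above, that a quadratic in time is still a linear model, can be sketched in base R. The example below uses simulated data and plain lm() without the subject-level random effects, so it only illustrates the raw versus orthogonal polynomial parameterizations. All variable names and numbers are assumptions for illustration.

```r
set.seed(1)

# Hypothetical rise-then-decline signal sampled every 5 ms.
time <- seq(5, 500, by = 5)
power <- 1 + 0.04 * time - 0.00008 * time^2 + rnorm(length(time), sd = 0.2)

# Raw polynomial terms versus orthogonal polynomials from poly().
fit_raw  <- lm(power ~ time + I(time^2))
fit_orth <- lm(power ~ poly(time, 2))

# Same fitted curve, different parameterization of the coefficients.
same_fit <- isTRUE(all.equal(unname(fitted(fit_raw)), unname(fitted(fit_orth))))

# Columns of poly() are orthonormal, so their cross-product is the identity.
P <- poly(time, 2)
orthonormal <- isTRUE(all.equal(as.vector(crossprod(P)), c(1, 0, 0, 1)))

print(c(same_fit = same_fit, orthonormal = orthonormal))
```

Both fits are linear in their coefficients even though the fitted curve is not a straight line in time; the orthogonal version mainly helps with numerical stability and with less correlated coefficient estimates.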