[R-sig-ME] Longitudinal logistic regression with continuous-time first-order autocorrelation structure
Thierry Onkelinx
thierry.onkelinx at inbo.be
Tue Mar 6 16:37:57 CET 2018
Dear Dennis,
You converted the time in factors, making it a discrete and regular spaced.
The Ornstein-Uhlenbeck process is the continuous version of 'ar1'. I'm more
familiar with `ar1` and `rw1`, so I didn't look at the Ornstein-Uhlenbeck
process. Note that INLA has an Ornstein-Uhlenbeck process ("ou"), so feel
free to try that as well.
f(Hour, model = "rw1") is the common pattern for all participants. f(HourID,
model = "rw1", replicate = ID) allows for a separate pattern for each
participant but with the same hyperparameters. When f(Hour, model = "rw1")
is the non-linear analogue for a fixed slope Hour, then f(HourID, model =
"rw1", replicate = ID) is then non-linear analogue for the random slope
(Hour|ID).
Several options to include the day effect. A simple one is to assume that
the between day effect is orthogonal to the within day effect. In that case
you simple at Day in some relevant form to the model e.g. Day, poly(Day,
2), f(Day, model = "rw1"), ... Another extreme is to use f(DayTime, model =
"rw1") with DayTime = Day + Hour / 24. This creates a smooth transition
between consecutive days but does not force a diurnal pattern.
Best regards,
ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be
///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////
<https://www.inbo.be>
2018-03-03 8:44 GMT+01:00 Dennis Ruenger <dennis.ruenger at gmail.com>:
> Thank you, Thierry, for the INLA model.
>
> Can you tell me why you chose a random walk rather than, say, an
> Ornstein-Uhlenbeck process, to model temporal autocorrelation? The INLA
> documentation that I could find online is rather sparse, and I struggle
> with how you model the hierarchical structure, that is f(HourID, model =
> "rw1", replicate = ID). Lastly, can you tell me how you would incorporate
> an effect of Day in the model? I’m expecting that the likelihood of craving
> decreases across days, and that this trend varies between participants.
>
> Dennis
>
> On Mar 2, 2018 at 2:36 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be>
> wrote:
>
>
> Dear Dennis,
>
> Here is an example on how to run such a model with INLA. It took about 3.5
> min on my laptop.
>
> library(INLA)
> library(ggplot2)
> set.seed(20180302)
> n_id <- 35
> n_day <- 7 * 6
> n_prompt <- 5
> dataset <- expand.grid(
> ID = seq_len(n_id),
> Day = seq_len(n_day),
> Prompt = seq_len(n_prompt)
> )
> dataset$Hour <- sample(8:22, size = nrow(dataset), replace = TRUE)
> p0 <- rnorm(n_id, mean = -21, sd = 0.5)
> p1 <- rnorm(n_id, mean = 2.67, sd = 0.05)
> p2 <- rnorm(n_id, mean = -0.08, sd = 0.002)
> dataset$eta <- p2[dataset$ID] * dataset$Hour ^ 2 + p1[dataset$ID] *
> dataset$Hour + p0[dataset$ID]
> dataset$mu <- plogis(dataset$eta)
> ggplot(dataset, aes(x = Hour, y = mu, group = ID)) + geom_line()
> dataset$noise <- rnorm(nrow(dataset), sd = 0.1)
> dataset$prob <- plogis(dataset$eta + dataset$noise)
> dataset$Craving <- rbinom(nrow(dataset), size = 1, prob = dataset$prob)
>
> dataset$HourID <- dataset$Hour
> model <- inla(
> Craving ~
> f(Hour, model = "rw1") +
> f(HourID, model = "rw1", replicate = ID),
> family = "binomial",
> data = dataset
> )
> plot(model)
>
> Best regards,
>
> Thierry
>
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx at inbo.be
> Havenlaan 88
> <https://maps.google.com/?q=Havenlaan+88&entry=gmail&source=g> bus 73,
> 1000 Brussel
> www.inbo.be
>
> ////////////////////////////////////////////////////////////
> ///////////////////////////////
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> ////////////////////////////////////////////////////////////
> ///////////////////////////////
>
> <https://www.inbo.be>
>
> 2018-03-01 2:44 GMT+01:00 Dennis Ruenger <dennis.ruenger at gmail.com>:
>
>> Thanks, Alain and Ben, for your replies.
>>
>> My understanding is that for the kind of intensive longitudinal data I'm
>> dealing with, a mixed model with both random intercepts and slopes for the
>> time effect *and *autoregressive errors are recommended.
>>
>> I'd like to follow Alain's suggestion and give glmmTMB a try. Based on a
>> description of the covariance structures available with glmmTMB (link
>> below), it looks like the Ornstein–Uhlenbeck covariance structure might be
>> what I'm looking for (i.e., something akin to corrCAR1() that works in a
>> GLMM).
>>
>> So I tried:
>>
>> df$time_hours <- numFactor(df$time_hours)
>> fit <- glmmTMB(y ~ time_hours + (time_hours|id) + ou(time_hours-1|id),
>> family = binomial, data = df)
>>
>> However, after about 10 minutes, I receive an error message about failed
>> memory allocation (on a laptop with a 7th gen Intel Core i5 processor and
>> 8GB RAM). The data set includes 34 participants with up to 300 data points
>> per participants. Running the model for a subset of 5 participants also
>> resulted in memory allocation failure. The same was true for the spatial
>> Gaussian and spatial exponential covariance structures.
>>
>> Does anyone see a way to make this work with glmmTMB?
>>
>> Thanks a lot.
>>
>> https://cran.r-project.org/web/packages/glmmTMB/vignettes/covstruct.html
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list