[R-sig-ME] lme() - Error in recalc.corAR1

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Thu Jan 2 17:23:17 CET 2025


   Incomplete solution: scaling x and y seems to help (this is a 
surprisingly common solution for numerical problems, which lme is 
particularly prone to for large  data sets.

   However, I didn't try to fit the full model (500K observations); I 
used a random subsample of 10,000 points. You should try gradually 
increasing the size of the subsample to see how the problem scales 
(i.e., will you have to wait 3 minutes or 3 days for the full answer?) 
and to see if you run into any other problems

library(dplyr) ## data manipulation -- could also be done in base R
library(nlme)
library(glmmTMB)
library(ggplot2); theme_set(theme_bw())

set.seed(101)

dd <- read.csv("r-sig-mixed-02jan.csv")

## add per-user time step indicator, and dummy group (for glmmTMB);
##  scale x and y
dd2 <- (dd
     |> group_by(user_id)
     |> mutate(step = factor(seq(n())))
     |> ungroup()
     |> mutate(xs = drop(scale(x)), ys = drop(scale(y)),
               dummy = factor(1))
)

## subsample
dds <- dd2[sample(nrow(dd2), size = 1e4),]


ggplot(dd, aes(x, y, group = user_id)) + geom_line(alpha = 0.2)


system.time(m <- lme(
     ys ~ xs,
     random = ~xs|user_id,
     correlation = corAR1(form = ~1|user_id),
     weights = varExp(form = ~xs),
     method = 'REML',
     na.action = na.omit,
     data = dds,
     control = lmeControl(
         maxIter = 1000, msMaxIter = 1000,
         msVerbose = TRUE,
         returnObject = TRUE  ## return result even w/ optimizer warning
      )
     )
     )
## 3 seconds

The "singular convergence" warning from nlminb is inscrutable but 
possibly ignorable. 
https://stackoverflow.com/questions/79110546/glmmtmb-convergence-messages


Looking at the output, it seems that you will probably have to simplify 
your model in any case -- there is a correlation of 0.999 between the 
random intercepts and slopes. The estimated AR1 coefficient is also very 
small


## Try full data set. If the problem scales *linearly* (optimistic!)
## then we should expect this to take ~ 150 seconds ...

system.time(m2 <- update(m, data = dd2))


   glmmTMB works too, and gets similar answers, but is much slower in 
this case (6 minutes vs 3 seconds); turning on some parallelization 
*might* speed it up.

system.time(m3 <- glmmTMB(ys ~ xs + (xs|user_id) + ar1(0 + step |user_id),
         dispformula = ~0 + xs,  ## don't want a nugget in corAR1
         REML = TRUE,
         data = dds,
         verbose = TRUE)
         )

fixef(m)
fixef(m3)$cond

VarCorr(m)
VarCorr(m3)


On 2025-01-02 9:08 a.m., Santosh Srinivas wrote:
> Dear List Recipients, Hope this email finds you well.
> 
> I am writing to request your help the below issue:
> Error in recalc.corAR1(object[[i]], conLin): NA/NaN/Inf in foreign function call (arg 1)
> 
> This occurs when I run the following regression model:
> 
> m =
>    lme(
>      y ~ x,
>      random = ~x|user_id,
>      correlation = corAR1(form = ~1|user_id),
>      weights = varExp(form = ~x),
>      method = 'REML',
>      na.action = na.omit,
>      data = df)
> 
> I have uploaded the data here for reproducibility:
> https://drive.google.com/file/d/1kA5-T4zZ-f33gJgAsJ9AWMlK-YXAm65m/view?usp=sharing
> 
> Request your help with how to understand and fix this ussue.
> 
> Thanks & regards
> Santosh
> 
> 
> 
> 
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
 > E-mail is sent at my convenience; I don't expect replies outside of 
working hours.



More information about the R-sig-mixed-models mailing list