[R-sig-ME] Question regarding large data.frame in LMER?

Vinicius Maia v|n|c|u@@@@m@|@77 @end|ng |rom gm@||@com
Fri Dec 11 03:33:20 CET 2020


I agree with the comments above about centering and scaling the continuous
predictors and using poly() instead of ^2.
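
As a minimal sketch of what that could look like (the simulated ages and the
names age_z, age_poly, cor_raw and cor_poly are only for illustration, not
taken from the original data):

```r
# Illustrative only: simulated ages stand in for the real `age` column.
set.seed(1)
df <- data.frame(age = sample(18:90, 1000, replace = TRUE))

# Raw age and age^2 are strongly correlated and on very different scales.
cor_raw <- cor(df$age, df$age^2)

# Centre and scale age (mean 0, SD 1); as.numeric() drops the matrix attributes.
df$age_z <- as.numeric(scale(df$age))

# Orthogonal polynomial: the linear and quadratic columns are uncorrelated.
age_poly <- poly(df$age, 2)
cor_poly <- cor(age_poly[, 1], age_poly[, 2])

# In the model formula, `age + I(age^2)` would then become `poly(age, 2)`,
# in both the fixed part and the random-slope terms, e.g.
#   health ~ class * macro_unemployment + poly(age, 2) +
#     (class + poly(age, 2) | country) + ...
```

Note that poly() stops with an error if the variable contains missing values,
so rows with NA in age would have to be dropped first.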

How many levels do you have in country_year? Judging from the str() output,
this variable seems to have only two levels (1 and 2).
If country_year really has only two levels, it is not a good idea to treat
it as a random effect: you need more levels to estimate the variances of
random slopes and intercepts.
If that is the case, treating country_year as fixed may solve your problem.
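
A quick way to check this, again only as a sketch: the toy data frame below
mimics the 1/2 coding shown in the str() output, and f_fixed is a
hypothetical name for one possible version of Model1 with country_year moved
to the fixed part.

```r
# Toy data mimicking the str() output: country_year is coded 1/2.
df <- data.frame(
  country      = factor(rep(c("Austria", "Belgium", "Cyprus"), each = 4)),
  country_year = rep(c(1L, 2L), times = 6)
)

# Number of distinct levels and how many rows fall in each.
n_levels <- length(unique(df$country_year))
level_counts <- table(df$country_year)

# One possible reformulation (an assumption, not a tested model):
# country_year enters as a fixed factor, and its random-effects term is dropped.
f_fixed <- health ~ class * macro_unemployment + age + I(age^2) +
  factor(country_year) +
  (class + age + I(age^2) | country) +
  (1 | id)
```

On the real data, the same two lines for n_levels and level_counts would show
whether country_year really takes only the values 1 and 2; the formula would
then be passed to lmer() with data = df.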

Best,

Vinícius


On Thu, 10 Dec 2020 at 19:13, João Veríssimo <jl.verissimo using gmail.com>
wrote:

> Not sure if these are solutions, but I'd try:
>
> a) centering/scaling Age
>
> and/or
>
> b) using poly(Age, 2), rather than I(age^2)
> (i.e., an orthogonal polynomial)
>
> Maybe related to "badly scaled parameters"?
> https://github.com/lme4/lme4/issues/173
>
> João
>
> On Thu, 2020-12-10 at 12:11 +0000, Jad Moawad wrote:
> > I am working with a large data.frame that contains around 1.4 million
> > observations. Initially, I ran my models on a sub-sample (10% of the
> > full sample), because fitting a single model on the original data can
> > take a long time. Once I was sure that all variables were well
> > harmonized and all regressions were running fine, I ran my models on
> > the full sample. However, the regressions did not converge, and I
> > received the following two errors from two different models:
> >
> > Error in fun(xaa, ...) : Downdated VtV is not positive definite
> >
> > Error in fun(xss, ...) : Downdated VtV is not positive definite
> >
> > I use the lmer function to fit my models, and I include random slopes
> > at the country and country_year levels. The code I use is below.
> >
> > Model1 <- lmer(health ~ class + age + I(age^2) +
> >                  class*macro_unemployment +
> >                (class + age + I(age^2) | country) +
> >                (class + age + I(age^2) | country_year) +
> >                (1 | id), data = df)
> >
> > Model2 <- lmer(health ~ education + age + I(age^2) +
> >                  education*macro_unemployment +
> >                (education + age + I(age^2) | country) +
> >                (education + age + I(age^2) | country_year) +
> >                (1 | id), data = df)
> >
> >
> > Could someone help me please with solving this issue?
> >
> > Below you find a glimpse (str) of my data and my sessionInfo():
> >
> > tibble [1,370,264 × 8] (S3: grouped_df/tbl_df/tbl/data.frame)
> >  $ health            : num [1:1370264] 100 100 50 100 0 75 75 100 100 50 ...
> >  $ class             : Factor w/ 3 levels "Upper-middle class",..: 3 3 NA 3 3 3 3 1 1 3 ...
> >  $ education         : Factor w/ 3 levels "low","mid","high": 1 1 1 1 1 1 2 3 3 1 ...
> >  $ age               : num [1:1370264] 24 25 24 25 42 43 34 34 35 58 ...
> >  $ macro_unemployment: num [1:1370264] 5.24 4.86 5.24 4.86 5.24 ...
> >  $ id                : int [1:1370264] 2 2 3 3 4 4 6 7 7 8 ...
> >  $ country_year      : int [1:1370264] 1 2 1 2 1 2 1 1 2 1 ...
> >  $ country           : Factor w/ 30 levels "Austria","Belgium",..: 1 1 1 1 1 1 1 1 1 1 ...
> >  - attr(*, "groups")= tibble [27 × 2] (S3: tbl_df/tbl/data.frame)
> >   ..$ country: Factor w/ 30 levels "Austria","Belgium",..: 1 2 3 6 7 8 9 10 11 12 ...
> >   ..$ .rows  : list<int> [1:27]
> >   .. ..$ : int [1:47204] 1 2 3 4 5 6 7 8 9 10 ...
> >   .. ..$ : int [1:41361] 47205 47206 47207 47208 47209 47210 47211
> > 47212 47213 47214 ...
> >   .. ..$ : int [1:42407] 88566 88567 88568 88569 88570 88571 88572
> > 88573 88574 88575 ...
> >   .. ..$ : int [1:48253] 130973 130974 130975 130976 130977 130978
> > 130979 130980 130981 130982 ...
> >   .. ..$ : int [1:31917] 179226 179227 179228 179229 179230 179231
> > 179232 179233 179234 179235 ...
> >   .. ..$ : int [1:44047] 211143 211144 211145 211146 211147 211148
> > 211149 211150 211151 211152 ...
> >   .. ..$ : int [1:62087] 255190 255191 255192 255193 255194 255195
> > 255196 255197 255198 255199 ...
> >   .. ..$ : int [1:94309] 317277 317278 317279 317280 317281 317282
> > 317283 317284 317285 317286 ...
> >   .. ..$ : int [1:37246] 411586 411587 411588 411589 411590 411591
> > 411592 411593 411594 411595 ...
> >   .. ..$ : int [1:77253] 448832 448833 448834 448835 448836 448837
> > 448838 448839 448840 448841 ...
> >   .. ..$ : int [1:16823] 526085 526086 526087 526088 526089 526090
> > 526091 526092 526093 526094 ...
> >   .. ..$ : int [1:24687] 542908 542909 542910 542911 542912 542913
> > 542914 542915 542916 542917 ...
> >   .. ..$ : int [1:116263] 567595 567596 567597 567598 567599 567600
> > 567601 567602 567603 567604 ...
> >   .. ..$ : int [1:43218] 683858 683859 683860 683861 683862 683863
> > 683864 683865 683866 683867 ...
> >   .. ..$ : int [1:28709] 727076 727077 727078 727079 727080 727081
> > 727082 727083 727084 727085 ...
> >   .. ..$ : int [1:27583] 755785 755786 755787 755788 755789 755790
> > 755791 755792 755793 755794 ...
> >   .. ..$ : int [1:77960] 783368 783369 783370 783371 783372 783373
> > 783374 783375 783376 783377 ...
> >   .. ..$ : int [1:36922] 861328 861329 861330 861331 861332 861333
> > 861334 861335 861336 861337 ...
> >   .. ..$ : int [1:93194] 898250 898251 898252 898253 898254 898255
> > 898256 898257 898258 898259 ...
> >   .. ..$ : int [1:9004] 991444 991445 991446 991447 991448 991449
> > 991450 991451 991452 991453 ...
> >   .. ..$ : int [1:40074] 1000448 1000449 1000450 1000451 1000452
> > 1000453 1000454 1000455 1000456 1000457 ...
> >   .. ..$ : int [1:29342] 1040522 1040523 1040524 1040525 1040526
> > 1040527 1040528 1040529 1040530 1040531 ...
> >   .. ..$ : int [1:85124] 1069864 1069865 1069866 1069867 1069868
> > 1069869 1069870 1069871 1069872 1069873 ...
> >   .. ..$ : int [1:92350] 1154988 1154989 1154990 1154991 1154992
> > 1154993 1154994 1154995 1154996 1154997 ...
> >   .. ..$ : int [1:50188] 1247338 1247339 1247340 1247341 1247342
> > 1247343 1247344 1247345 1247346 1247347 ...
> >   .. ..$ : int [1:7598] 1297526 1297527 1297528 1297529 1297530
> > 1297531 1297532 1297533 1297534 1297535 ...
> >   .. ..$ : int [1:65141] 1305124 1305125 1305126 1305127 1305128
> > 1305129 1305130 1305131 1305132 1305133 ...
> >   .. ..@ ptype: int(0)
> >   ..- attr(*, ".drop")= logi TRUE
> > >
> >
> >
> > Session Info:
> >
> > R version 4.0.2 (2020-06-22)
> > Platform: x86_64-apple-darwin17.0 (64-bit)
> > Running under: macOS Catalina 10.15.6
> >
> > Matrix products: default
> > BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> > LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices
> > [4] utils     datasets  methods
> > [7] base
> >
> > other attached packages:
> >  [1] sessioninfo_1.1.1
> >  [2] sjlabelled_1.1.5
> >  [3] varhandle_2.0.5
> >  [4] labelled_2.7.0
> >  [5] dplyr_1.0.0
> >  [6] ggplot2_3.3.2
> >  [7] forcats_0.5.0
> >  [8] reprex_0.3.0
> >  [9] lmerTest_3.1-3
> > [10] lme4_1.1-25
> > [11] Matrix_1.2-18
> >
> > loaded via a namespace (and not attached):
> >  [1] Rcpp_1.0.4.6
> >  [2] compiler_4.0.2
> >  [3] pillar_1.4.4
> >  [4] nloptr_1.2.2.1
> >  [5] tools_4.0.2
> >  [6] digest_0.6.25
> >  [7] boot_1.3-25
> >  [8] statmod_1.4.34
> >  [9] lifecycle_0.2.0
> > [10] tibble_3.0.1
> > [11] nlme_3.1-148
> > [12] gtable_0.3.0
> > [13] lattice_0.20-41
> > [14] pkgconfig_2.0.3
> > [15] rlang_0.4.7
> > [16] cli_2.0.2
> > [17] rstudioapi_0.11
> > [18] haven_2.3.1
> > [19] withr_2.2.0
> > [20] hms_0.5.3
> > [21] generics_0.0.2
> > [22] vctrs_0.3.1
> > [23] fs_1.4.1
> > [24] grid_4.0.2
> > [25] tidyselect_1.1.0
> > [26] glue_1.4.1
> > [27] R6_2.4.1
> > [28] fansi_0.4.1
> > [29] minqa_1.2.4
> > [30] farver_2.0.3
> > [31] purrr_0.3.4
> > [32] magrittr_1.5
> > [33] scales_1.1.1
> > [34] ellipsis_0.3.1
> > [35] MASS_7.3-51.6
> > [36] splines_4.0.2
> > [37] insight_0.11.0
> > [38] assertthat_0.2.1
> > [39] colorspace_1.4-1
> > [40] numDeriv_2016.8-1.1
> > [41] labeling_0.3
> > [42] utf8_1.1.4
> > [43] munsell_0.5.0
> > [44] crayon_1.3.4
> >
> >
> >
> >
> > Sincerely,
> >
> >
> >
> > Jad Moawad
> >
> >
> > PhD candidate and teaching assistant
> > University of Lausanne  - NCCR Lives
> > Institut des Sciences Sociales
> > Bâtiment Geopolis - 5621
> > 1015 Lausanne
> > Switzerland
> >
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
