[R-sig-ME] Struggling (probably problem in wetware not software) with multilevel logistic regression
Chris Evans
chr|@ho|d @end|ng |rom p@yctc@org
Fri May 1 20:37:59 CEST 2020
I am sorry if this is embarrassingly stupid/easy but I am failing to understand something I thought would be fairly easy.
The question of interest to me and others is whether people omit some items of a questionnaire more than others.
To give a simple example I have several datasets for a ten item self-report questionnaire where one item is about risk
and where three items are cued positively and the other seven are cued negatively.
Here is an example of some such data:
ID Item Score missing missRand missReg riskItem posItem
<chr> <chr> <dbl> <dbl> <int> <int> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0 0 0 0 0
2 ALCA0001 YPCORE02 0 0 0 0 0 0
3 ALCA0001 YPCORE03 0 0 0 0 0 1
4 ALCA0001 YPCORE04 0 0 0 0 1 0
5 ALCA0001 YPCORE05 0 0 0 0 0 1
6 ALCA0001 YPCORE06 1 0 0 0 0 0
7 ALCA0001 YPCORE07 0 0 0 0 0 0
8 ALCA0001 YPCORE08 0 0 0 0 0 0
9 ALCA0001 YPCORE09 0 0 0 0 0 0
10 ALCA0001 YPCORE10 0 0 0 0 0 1
11 ALCA0004 YPCORE01 1 0 0 0 0 0
12 ALCA0004 YPCORE02 0 0 0 0 0 0
13 ALCA0004 YPCORE03 0 0 0 0 0 1
14 ALCA0004 YPCORE04 0 0 0 0 1 0
15 ALCA0004 YPCORE05 0 0 0 0 0 1
missing is the observed omission of the item (1 = omitted)
missRand is a binomial random with probability = .5 I created
missReg is a biomial random with probability = item index number / 11, i.e. strong relationship with item
riskItem identifies the risk items (item 4)
posItem identifies the positively cued items (3, 5 and 10)
For this particular dataset there are 237 participants. (Like Groucho Marks and principles, I have
others if you don't like this one!)
The catch is that each participant has only completed the measure once. It is a plausible model that
participants will vary in their propensity to omit items, it is also highly plausible that participants
might omit the risk item more than the other item. It's a question of interest (though I think not of
huge interest nor would I bet on it (!) that the positively cued items might be omitted more than the
negatively cued ones or v.v.
I thought I could test for these effects using glmer(). That works for
> fitRisk <- glmer(missing ~ riskItem + (1 | ID), family = binomial, data = longDat)
> summary(fitRisk)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: missing ~ riskItem + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
51.7 69.0 -22.9 45.7 2367
Scaled residuals:
Min 1Q Median 3Q Max
-0.52020 -0.00113 -0.00113 -0.00113 2.93829
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 205.3 14.33
Number of obs: 2370, groups: ID, 237
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -13.563 2.540 -5.339 9.33e-08 ***
riskItem -2.418 3.745 -0.646 0.518
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
riskItem 0.064
No effect but small dataset and test of principle works. However, when I come to test the overall difference by items I hit
5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
BITE23 BITE24 BITE25 BITE26 BITE27 BITE28 BITE29 BITE30 BITE31 BITE32 BITE33 Notas origRow Centro2 transfer one
1 NA NA NA NA NA NA NA NA NA NA NA <NA> 43 Urgell 0 1
2 NA NA NA NA NA NA NA NA NA NA NA <NA> 68 Urgell 0 1
3 NA NA NA NA NA NA NA NA NA NA NA <NA> 69 Urgell 0 1
4 NA NA NA NA NA NA NA NA NA NA NA <NA> 70 Urgell 0 1
5 NA NA NA NA NA NA NA NA NA NA NA <NA> 71 Urgell 0 1
Exclude quaireN ID.quaireN Nquaires ID.episodeN episodeN nEpisodes Fecha.inicio Fecha.fin ExpCOREOM ExpSFA ExpSFB
1 0 5 URGE0003-5 27 URGE0003-001 1 1 2017-11-13 2019-02-12 NA NA NA
2 0 3 URGE0004-3 36 URGE0004-001 1 2 2017-11-21 2018-09-10 NA NA NA
3 0 4 URGE0004-4 36 URGE0004-001 1 2 2017-11-21 2018-09-10 NA NA NA
4 0 5 URGE0004-5 36 URGE0004-001 1 2 2017-11-21 2018-09-10 NA NA NA
5 0 6 URGE0004-6 36 URGE0004-001 1 2 2017-11-21 2018-09-10 NA NA NA
ExpYPCORE ExpEAT ExpBITE missingCOREOM missingSFA missingSFB missingYP missingEAT missingBITE impossCOREOM impossSFA
1 NA NA NA NA NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA NA NA NA
impossSFB impossYP impossEAT impossBITE
1 NA NA NA NA
2 NA NA NA NA
3 NA NA NA NA
4 NA NA NA NA
5 NA NA NA NA
[ reached 'max' / getOption("max.print") -- omitted 672 rows ]
> datPrincip_URGE %>%
+ arrange(Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ dim()
[1] 677 195
> datPrincip_URGE %>%
+ arrange(Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ dim()
[1] 61 197
> dat <- bind_rows(datPrincip_ALCA,
+ datPrincip_URGE)
> dat %>%
+ arrange(Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ dim()
[1] 118 197
> dat <- bind_rows(datPrincip_ALCA,
+ datPrincip_ARGE,
+ datPrincip_AVEN,
+ datPrincip_MOSC,
+ datPrincip_SABA,
+ datPrincip_SEVI,
+ datPrincip_TARR,
+ datPrincip_URGE)
> dat %>%
+ arrange(Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ dim()
[1] 237 197
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ dim()
[1] 237 197
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ dim()
[1] 237 11
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ pivot_longer(cols = starts_with("YPCORE")) %>%
+ dim()
[1] 2370 3
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ pivot_longer(cols = starts_with("YPCORE")) %>%
+ mutate(missing = if_else(!is.na(value), 0, 1)) -> longDat
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID name value missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> table(longDat$missing)
0 1
2356 14
> ?pivot_longer
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ pivot_longer(cols = starts_with("YPCORE"),
+ names_to = "Item") %>%
+ mutate(missing = if_else(!is.na(value), 0, 1)) -> longDat
> head(longD)
Error in head(longD) : object 'longD' not found
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID Item value missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ pivot_longer(cols = starts_with("YPCORE"),
+ names_to = "Item",
+ values_to = "Score") %>%
+ mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> formul1 <- missing ~ Item + Item | ID
> fit <- glmer(formul1, family = binomial, data = longDat)
Error in glmer(formul1, family = binomial, data = longDat) :
could not find function "glmer"
> library(lme4)
Loading required package: Matrix
Attaching package: ‘Matrix’
The following objects are masked from ‘package:tidyr’:
expand, pack, unpack
> formul1 <- missing ~ Item + Item | ID
> fit <- glmer(formul1, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
Warning messages:
1: In commonArgs(par, fn, control, environment()) :
maxfun < 10 * length(par)^2 is not recommended.
2: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
failure to converge in 10000 evaluations
> formul1 <- missing ~ Item | ID
> fit <- glmer(formul1, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
Warning messages:
1: In commonArgs(par, fn, control, environment()) :
maxfun < 10 * length(par)^2 is not recommended.
2: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
failure to converge in 10000 evaluations
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> ?glmer
> ?mcnemar.test
> table(is.na(dat$YPCORE01), is.na(dat$YPCORE02))
FALSE TRUE
FALSE 3573 4
TRUE 1 9499
> mcnemar.test(table(is.na(dat$YPCORE01), is.na(dat$YPCORE02)))
McNemar's Chi-squared test with continuity correction
data: table(is.na(dat$YPCORE01), is.na(dat$YPCORE02))
McNemar's chi-squared = 0.8, df = 1, p-value = 0.3711
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ pivot_longer(cols = c("YPCORE01", "YPCORE02")),
Error: unexpected ',' in:
" select(ID, starts_with("YP")) %>%
pivot_longer(cols = c("YPCORE01", "YPCORE02")),"
> names_to = "Item",
Error: unexpected ',' in " names_to = "Item","
> values_to = "Score") %>%
Error: unexpected ')' in " values_to = "Score")"
> mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
Error in if_else(!is.na(Score), 0, 1) : object 'Score' not found
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YP")) %>%
+ pivot_longer(cols = c("YPCORE01", "YPCORE02"),
+ names_to = "Item",
+ values_to = "Score") %>%
+ mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> head(longDat)
# A tibble: 6 x 12
# Groups: ID [3]
ID YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08 YPCORE09 YPCORE10 Item Score missing
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
1 ALCA0001 0 0 0 1 0 0 0 0 YPCORE01 1 0
2 ALCA0001 0 0 0 1 0 0 0 0 YPCORE02 0 0
3 ALCA0004 0 0 0 1 0 0 0 0 YPCORE01 1 0
4 ALCA0004 0 0 0 1 0 0 0 0 YPCORE02 0 0
5 ALCA0007 1 0 0 1 0 2 1 0 YPCORE01 2 0
6 ALCA0007 1 0 0 1 0 2 1 0 YPCORE02 1 0
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, c("YPCORE01", "YPCORE02")) %>%
+ pivot_longer(cols = c("YPCORE01", "YPCORE02"),
+ names_to = "Item",
+ values_to = "Score") %>%
+ mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [3]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0004 YPCORE01 1 0
4 ALCA0004 YPCORE02 0 0
5 ALCA0007 YPCORE01 2 0
6 ALCA0007 YPCORE02 1 0
> fit <- glmer(formul1, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
> ?isSingular
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [3]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0004 YPCORE01 1 0
4 ALCA0004 YPCORE02 0 0
5 ALCA0007 YPCORE01 2 0
6 ALCA0007 YPCORE02 1 0
> table(longDat$missing)
0 1
472 2
> library(lme4)
> formul1 <- missing ~ (Item | ID)
> fit <- glmer(formul1, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
> longDat$missRand <- rbinom(1, length(longDat), .5)
> head(longDat)
# A tibble: 6 x 5
# Groups: ID [3]
ID Item Score missing missRand
<chr> <chr> <dbl> <dbl> <int>
1 ALCA0001 YPCORE01 1 0 2
2 ALCA0001 YPCORE02 0 0 2
3 ALCA0004 YPCORE01 1 0 2
4 ALCA0004 YPCORE02 0 0 2
5 ALCA0007 YPCORE01 2 0 2
6 ALCA0007 YPCORE02 1 0 2
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> ?rbinom
> rbinom(1,1,.05)
[1] 0
> rbinom(1,1,.05)
[1] 0
> rbinom(1,1,.95)
[1] 1
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.95)
[1] 1
> rbinom(1,1,.5)
[1] 1
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 1
> rbinom(1,1,.5)
[1] 0
> rbinom(1,1,.5)
[1] 1
> rbinom(1,2,.5)
[1] 2
> rbinom(2,1,.5)
[1] 0 0
> rbinom(2,1,.5)
[1] 0 0
> rbinom(2,1,.5)
[1] 0 0
> rbinom(2,1,.5)
[1] 0 1
> rbinom(2,1,.5)
[1] 1 1
> rbinom(2,1,.5)
[1] 0 1
> longDat$missRand <- rbinom(length(longDat), 1, .5)
Error: Assigned data `rbinom(length(longDat), 1, 0.5)` must be compatible with existing data.
x Existing data has 474 rows.
x Assigned data has 5 rows.
ℹ Only vectors of size 1 are recycled.
Run `rlang::last_error()` to see where the error occurred.
> table(longDat$missing)
0 1
472 2
> library(lme4)
> head(longDat)
# A tibble: 6 x 5
# Groups: ID [3]
ID Item Score missing missRand
<chr> <chr> <dbl> <dbl> <int>
1 ALCA0001 YPCORE01 1 0 2
2 ALCA0001 YPCORE02 0 0 2
3 ALCA0004 YPCORE01 1 0 2
4 ALCA0004 YPCORE02 0 0 2
5 ALCA0007 YPCORE01 2 0 2
6 ALCA0007 YPCORE02 1 0 2
> rbinom(2,1,.5)
[1] 1 0
> rbinom(2,1,.5)
[1] 1 1
> rbinom(2,1,.5)
[1] 1 1
> longDat$missRand <- NULL
> longDat$missRand <- rbinom(length(longDat), 1, .5)
Error: Assigned data `rbinom(length(longDat), 1, 0.5)` must be compatible with existing data.
x Existing data has 474 rows.
x Assigned data has 4 rows.
ℹ Only vectors of size 1 are recycled.
Run `rlang::last_error()` to see where the error occurred.
> longDat$missRand <- rbinom(nrow(longDat), 1, .5)
> table(longDat$missRand)
0 1
240 234
> head(longDat)
# A tibble: 6 x 5
# Groups: ID [3]
ID Item Score missing missRand
<chr> <chr> <dbl> <dbl> <int>
1 ALCA0001 YPCORE01 1 0 0
2 ALCA0001 YPCORE02 0 0 1
3 ALCA0004 YPCORE01 1 0 0
4 ALCA0004 YPCORE02 0 0 1
5 ALCA0007 YPCORE01 2 0 0
6 ALCA0007 YPCORE02 1 0 0
> formul1 <- missRand ~ (Item | ID)
> fit <- glmer(formul1, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
> fit <- glmer(missRand ~ Item, family = binomial, data = longDat)
Error: No random effects terms specified in formula
> fit <- glmer(missRand ~ Item | ID, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
> fit <- glmer(factor(missRand) ~ Item | ID, family = binomial, data = longDat)
boundary (singular) fit: see ?isSingular
> fit <- glmer(missRand ~ Item | ID, family = "logistic", data = longDat)
Error in get(family, mode = "function", envir = parent.frame(2)) :
object 'logistic' of mode 'function' was not found
> fit <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data = longDat)
boundary (singular) fit: see ?isSingular
> ?glmer
> longDat
# A tibble: 474 x 5
# Groups: ID [237]
ID Item Score missing missRand
<chr> <chr> <dbl> <dbl> <int>
1 ALCA0001 YPCORE01 1 0 0
2 ALCA0001 YPCORE02 0 0 1
3 ALCA0004 YPCORE01 1 0 0
4 ALCA0004 YPCORE02 0 0 1
5 ALCA0007 YPCORE01 2 0 0
6 ALCA0007 YPCORE02 1 0 0
7 ALCA0012 YPCORE01 3 0 1
8 ALCA0012 YPCORE02 1 0 1
9 ALCA0013 YPCORE01 4 0 1
10 ALCA0013 YPCORE02 3 0 1
# … with 464 more rows
> table(longDat$missRand, longDat$Item)
YPCORE01 YPCORE02
0 125 115
1 112 122
>
> fit <- glmer(missRand ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
boundary (singular) fit: see ?isSingular
> fit <- glmer(missRand ~ factor(Item) + (1 | factor(ID)), family = "binomial", data = longDat)
Error: couldn't evaluate grouping factor factor(ID) within model frame: try adding grouping factor to data frame explicitly if possible
> fit <- glmer(missRand ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
boundary (singular) fit: see ?isSingular
> rm(fit)
> fit <- glmer(missRand ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
boundary (singular) fit: see ?isSingular
> fit
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: missRand ~ factor(Item) + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
662.1833 674.6669 -328.0917 656.1833 471
Random effects:
Groups Name Std.Dev.
ID (Intercept) 1.943e-07
Number of obs: 474, groups: ID, 237
Fixed Effects:
(Intercept) factor(Item)YPCORE02
-0.1098 0.1689
convergence code 0; 0 optimizer warnings; 1 lme4 warnings
> dat %>%
+ arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
+ filter(Cuestionarios == "YP-CORE" &
+ Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
+ group_by(ID) %>%
+ mutate(occasion = row_number(),
+ first = if_else(occasion == 1, 1, 0)) %>% # to get first ocassion only
+ filter(first == 1) %>%
+ select(ID, starts_with("YPCORE")) %>%
+ pivot_longer(cols = starts_with("YPCORE"),
+ names_to = "Item",
+ values_to = "Score") %>%
+ mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> formul1 <- missing ~ (Item | ID)
> fit <- glmer(missing ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 1.78883 (tol = 0.002, component 1)
> fit <- glmer((score == 1) ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
Error in eval(predvars, data, env) : object 'score' not found
> fit <- glmer((Score == 1) ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00309555 (tol = 0.002, component 1)
> fit <- glmer((Score == 1) ~ factor(Item) + (factor(Item) | ID), family = "binomial", data = longDat)
Error: number of observations (=2356) < number of random effects (=2360) for term (factor(Item) | ID); the random-effects parameters are probably unidentifiable
> fit <- glmer((Score == 1) ~ factor(Item) + (Item | ID), family = "binomial", data = longDat)
Error: number of observations (=2356) < number of random effects (=2360) for term (Item | ID); the random-effects parameters are probably unidentifiable
> fit <- glmer((Score == 1) ~ factor(Item) + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00309555 (tol = 0.002, component 1)
> head(longDat)
# A tibble: 6 x 4
# Groups: ID [1]
ID Item Score missing
<chr> <chr> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0
2 ALCA0001 YPCORE02 0 0
3 ALCA0001 YPCORE03 0 0
4 ALCA0001 YPCORE04 0 0
5 ALCA0001 YPCORE05 0 0
6 ALCA0001 YPCORE06 1 0
> longDat$missRand <- rbinom(nrow(longDat), 1, .5)
> table(longDat$missRand)
0 1
1211 1159
> fit <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data = longDat)
boundary (singular) fit: see ?isSingular
> fit
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: missRand ~ Item + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
3300.819 3364.296 -1639.410 3278.819 2359
Random effects:
Groups Name Std.Dev.
ID (Intercept) 2.663e-07
Number of obs: 2370, groups: ID, 237
Fixed Effects:
(Intercept) ItemYPCORE02 ItemYPCORE03 ItemYPCORE04 ItemYPCORE05 ItemYPCORE06 ItemYPCORE07 ItemYPCORE08
-5.909e-02 1.013e-01 -1.690e-02 1.182e-01 6.753e-02 -1.186e-01 -1.046e-15 -1.186e-01
ItemYPCORE09 ItemYPCORE10
1.858e-01 -6.766e-02
convergence code 0; 0 optimizer warnings; 1 lme4 warnings
> anova(fit)
Analysis of Variance Table
npar Sum Sq Mean Sq F value
Item 9 5.5453 0.61615 0.6161
> table(longDat$Item, as.numeric(longDat$Item))
< table of extent 10 x 0 >
Warning message:
In table(longDat$Item, as.numeric(longDat$Item)) :
NAs introduced by coercion
> dim(longDat)
[1] 2370 5
> head(longDat)
# A tibble: 6 x 5
# Groups: ID [1]
ID Item Score missing missRand
<chr> <chr> <dbl> <dbl> <int>
1 ALCA0001 YPCORE01 1 0 1
2 ALCA0001 YPCORE02 0 0 1
3 ALCA0001 YPCORE03 0 0 1
4 ALCA0001 YPCORE04 0 0 1
5 ALCA0001 YPCORE05 0 0 1
6 ALCA0001 YPCORE06 1 0 0
> table(longDat$Item) #, as.numeric(longDat$Item))
YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08 YPCORE09 YPCORE10
237 237 237 237 237 237 237 237 237 237
> table(as.numeric(longDat$Item)) #, as.numeric(longDat$Item))
< table of extent 0 >
Warning message:
In table(as.numeric(longDat$Item)) : NAs introduced by coercion
> table(longDat$Item, as.numeric(factor(longDat$Item)))
1 2 3 4 5 6 7 8 9 10
YPCORE01 237 0 0 0 0 0 0 0 0 0
YPCORE02 0 237 0 0 0 0 0 0 0 0
YPCORE03 0 0 237 0 0 0 0 0 0 0
YPCORE04 0 0 0 237 0 0 0 0 0 0
YPCORE05 0 0 0 0 237 0 0 0 0 0
YPCORE06 0 0 0 0 0 237 0 0 0 0
YPCORE07 0 0 0 0 0 0 237 0 0 0
YPCORE08 0 0 0 0 0 0 0 237 0 0
YPCORE09 0 0 0 0 0 0 0 0 237 0
YPCORE10 0 0 0 0 0 0 0 0 0 237
> longDat %>%
+ mutate(missRand = rbinom(1, 1, .5))
# A tibble: 2,370 x 5
# Groups: ID [237]
ID Item Score missing missRand
<chr> <chr> <dbl> <dbl> <int>
1 ALCA0001 YPCORE01 1 0 1
2 ALCA0001 YPCORE02 0 0 1
3 ALCA0001 YPCORE03 0 0 1
4 ALCA0001 YPCORE04 0 0 1
5 ALCA0001 YPCORE05 0 0 1
6 ALCA0001 YPCORE06 1 0 1
7 ALCA0001 YPCORE07 0 0 1
8 ALCA0001 YPCORE08 0 0 1
9 ALCA0001 YPCORE09 0 0 1
10 ALCA0001 YPCORE10 0 0 1
# … with 2,360 more rows
> table(longDat$missRand)
0 1
1211 1159
> longDat %>%
+ mutate(missRand = rbinom(1, 1, .5),
+ missReg = rbinom(1, 1, as.numeric(factor(Item))/11))
# A tibble: 2,370 x 6
# Groups: ID [237]
ID Item Score missing missRand missReg
<chr> <chr> <dbl> <dbl> <int> <int>
1 ALCA0001 YPCORE01 1 0 1 0
2 ALCA0001 YPCORE02 0 0 1 0
3 ALCA0001 YPCORE03 0 0 1 0
4 ALCA0001 YPCORE04 0 0 1 0
5 ALCA0001 YPCORE05 0 0 1 0
6 ALCA0001 YPCORE06 1 0 1 0
7 ALCA0001 YPCORE07 0 0 1 0
8 ALCA0001 YPCORE08 0 0 1 0
9 ALCA0001 YPCORE09 0 0 1 0
10 ALCA0001 YPCORE10 0 0 1 0
# … with 2,360 more rows
> table(longDat$missReg)
< table of extent 0 >
Warning message:
Unknown or uninitialised column: `missReg`.
> longDat %>%
+ mutate(missRand = rbinom(1, 1, .5),
+ missReg = rbinom(1, 1, as.numeric(factor(Item))/11)) -> longDat
> table(longDat$missReg)
0 1
2120 250
> fit <- glmer(missReg ~ Item + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 3.81902 (tol = 0.002, component 1)
> longDat %>%
+ mutate(missRand = rbinom(1, 1, .5),
+ missReg = rbinom(1, 1, as.numeric(factor(Item))/11)) -> longDat
> head(longDat)
# A tibble: 6 x 6
# Groups: ID [1]
ID Item Score missing missRand missReg
<chr> <chr> <dbl> <dbl> <int> <int>
1 ALCA0001 YPCORE01 1 0 1 1
2 ALCA0001 YPCORE02 0 0 1 1
3 ALCA0001 YPCORE03 0 0 1 1
4 ALCA0001 YPCORE04 0 0 1 1
5 ALCA0001 YPCORE05 0 0 1 1
6 ALCA0001 YPCORE06 1 0 1 1
> table(longDat$missRand)
0 1
1140 1230
> table(longDat$missReg)
0 1
2160 210
> library(lme4)
> fitRand <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 5.46528 (tol = 0.002, component 1)
> anova(fitRand)
Analysis of Variance Table
npar Sum Sq Mean Sq F value
Item 9 0.050372 0.0055969 0.0056
> length(unique(longDat$ID))
[1] 237
> rm(fitRand)
> anova(fitRand)
Error in anova(fitRand) : object 'fitRand' not found
> fitRand <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 5.46528 (tol = 0.002, component 1)
> fitRand <- glmer(missRand ~ Item + (1 | ID), family = binomial, data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 5.46528 (tol = 0.002, component 1)
> fitRand <- glmer(missRand ~ Item + (1 | ID), family = binomial(link = "logit"), data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 5.46528 (tol = 0.002, component 1)
> ?glmer
> data("cbpp")
> dim(cbpp)
[1] 56 4
> head(cbpp)
herd incidence size period
1 1 2 14 1
2 1 3 12 2
3 1 4 9 3
4 1 0 5 4
5 2 3 22 1
6 2 1 18 2
> fitRand <- glmer(cbind(missRand, 1) ~ Item + (1 | ID), family = binomial(link = "logit"), data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00550177 (tol = 0.002, component 1)
> fitReg <- glmer(cbind(missReg, 1) ~ Item + (1 | ID), family = "binomial", data = longDat)
Warning messages:
1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
failure to converge in 10000 evaluations
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 1.47499 (tol = 0.002, component 1)
> fitReal <- glmer(cbind(missing, 1) ~ Item + (1 | ID), family = "binomial", data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 1.29013 (tol = 0.002, component 1)
> data(mlmdata)
Warning message:
In data(mlmdata) : data set ‘mlmdata’ not found
> library(haven)
> mlmdata <- read_dta("https://stats.idre.ucla.edu/stat/examples/imm/imm10.dta")
> head(mlmdata)
# A tibble: 6 x 19
schid stuid ses meanses homework white parented public ratio percmin math sex race sctype cstr scsize urban
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 7472 3 -0.130 -0.483 1 1 2 1 19 0 48 2 4 1 2 3 2
2 7472 8 -0.390 -0.483 0 1 2 1 19 0 48 1 4 1 2 3 2
3 7472 13 -0.800 -0.483 0 1 2 1 19 0 53 1 4 1 2 3 2
4 7472 17 -0.720 -0.483 1 1 2 1 19 0 42 1 4 1 2 3 2
5 7472 27 -0.740 -0.483 2 1 2 1 19 0 43 2 4 1 2 3 2
6 7472 28 -0.580 -0.483 1 1 2 1 19 0 57 2 4 1 2 3 2
# … with 2 more variables: region <dbl>, schnum <dbl>
> model <- glmer(white ~ homework + (1 + homework | schid), data=mlmdata,
+ family=binomial(link="logit"))
boundary (singular) fit: see ?isSingular
> summary(model)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: white ~ homework + (1 + homework | schid)
Data: mlmdata
AIC BIC logLik deviance df.resid
182.4 200.2 -86.2 172.4 255
Scaled residuals:
Min 1Q Median 3Q Max
-4.3373 -0.1184 0.1112 0.3421 3.8801
Random effects:
Groups Name Variance Std.Dev. Corr
schid (Intercept) 16.28745 4.0358
homework 0.04678 0.2163 -1.00
Number of obs: 260, groups: schid, 10
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.67365 1.47324 1.136 0.256
homework 0.04508 0.18433 0.245 0.807
Correlation of Fixed Effects:
(Intr)
homework -0.590
convergence code: 0
boundary (singular) fit: see ?isSingular
> dim(mlmdata)
[1] 260 19
> head(mlmdata)
# A tibble: 6 x 19
schid stuid ses meanses homework white parented public ratio percmin math sex race sctype cstr scsize urban
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 7472 3 -0.130 -0.483 1 1 2 1 19 0 48 2 4 1 2 3 2
2 7472 8 -0.390 -0.483 0 1 2 1 19 0 48 1 4 1 2 3 2
3 7472 13 -0.800 -0.483 0 1 2 1 19 0 53 1 4 1 2 3 2
4 7472 17 -0.720 -0.483 1 1 2 1 19 0 42 1 4 1 2 3 2
5 7472 27 -0.740 -0.483 2 1 2 1 19 0 43 2 4 1 2 3 2
6 7472 28 -0.580 -0.483 1 1 2 1 19 0 57 2 4 1 2 3 2
# … with 2 more variables: region <dbl>, schnum <dbl>
> length(unique(mlmdata$schid))
[1] 10
> table(white)
Error in table(white) : object 'white' not found
> table(mlmdata$white)
0 1
71 189
> fitRand <- glmer(missRand ~ Item + (1 | ID), family = binomial(link = "logit"), data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 5.46528 (tol = 0.002, component 1)
> ,
Error: unexpected ',' in ","
> longDat %>%
+ mutate(missRand = rbinom(1, 1, .5),
+ missReg = rbinom(1, 1, as.numeric(factor(Item))/11),
+ riskItem = if_else(Item == "YPCORE04", 1, 0),
+ posItem = if_else(Item %in% c("YPCORE03", "YPCORE5", "YPCORE10"), 1, 0)) -> longDat
> head(longDat)
# A tibble: 6 x 8
# Groups: ID [1]
ID Item Score missing missRand missReg riskItem posItem
<chr> <chr> <dbl> <dbl> <int> <int> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0 0 0 0 0
2 ALCA0001 YPCORE02 0 0 0 0 0 0
3 ALCA0001 YPCORE03 0 0 0 0 0 1
4 ALCA0001 YPCORE04 0 0 0 0 1 0
5 ALCA0001 YPCORE05 0 0 0 0 0 0
6 ALCA0001 YPCORE06 1 0 0 0 0 0
> table(longDat$riskItem, longDat$posItem)
0 1
0 1659 474
1 237 0
> table(longDat$riskItem, longDat$Item)
YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08 YPCORE09 YPCORE10
0 237 237 237 0 237 237 237 237 237 237
1 0 0 0 237 0 0 0 0 0 0
> table(longDat$posItem, longDat$Item)
YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08 YPCORE09 YPCORE10
0 237 237 0 237 237 237 237 237 237 0
1 0 0 237 0 0 0 0 0 0 237
> longDat %>%
+ mutate(missRand = rbinom(1, 1, .5),
+ missReg = rbinom(1, 1, as.numeric(factor(Item))/11),
+ riskItem = if_else(Item == "YPCORE04", 1, 0),
+ posItem = if_else(Item %in% c("YPCORE03", "YPCORE05", "YPCORE10"), 1, 0)) -> longDat
> head(longDat)
# A tibble: 6 x 8
# Groups: ID [1]
ID Item Score missing missRand missReg riskItem posItem
<chr> <chr> <dbl> <dbl> <int> <int> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0 0 0 0 0
2 ALCA0001 YPCORE02 0 0 0 0 0 0
3 ALCA0001 YPCORE03 0 0 0 0 0 1
4 ALCA0001 YPCORE04 0 0 0 0 1 0
5 ALCA0001 YPCORE05 0 0 0 0 0 1
6 ALCA0001 YPCORE06 1 0 0 0 0 0
> table(longDat$posItem, longDat$Item)
YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08 YPCORE09 YPCORE10
0 237 237 0 237 0 237 237 237 237 0
1 0 0 237 0 237 0 0 0 0 237
> fitRisk <- glmer(missing ~ riskItem + (1 | ID), family = binomial, data = longDat)
> summary(fitRisk)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: missing ~ riskItem + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
51.7 69.0 -22.9 45.7 2367
Scaled residuals:
Min 1Q Median 3Q Max
-0.52020 -0.00113 -0.00113 -0.00113 2.93829
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 205.3 14.33
Number of obs: 2370, groups: ID, 237
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -13.563 2.540 -5.339 9.33e-08 ***
riskItem -2.418 3.745 -0.646 0.518
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
riskItem 0.064
> fitRisk <- glmer(missing ~ posItem + (1 | ID), family = binomial, data = longDat)
> fitRisk <- glmer(missing ~ riskItem + (1 | ID), family = binomial, data = longDat)
> summary(fitRisk)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: missing ~ riskItem + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
51.7 69.0 -22.9 45.7 2367
Scaled residuals:
Min 1Q Median 3Q Max
-0.52020 -0.00113 -0.00113 -0.00113 2.93829
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 205.3 14.33
Number of obs: 2370, groups: ID, 237
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -13.563 2.540 -5.339 9.33e-08 ***
riskItem -2.418 3.745 -0.646 0.518
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
riskItem 0.064
> fitPos <- glmer(missing ~ posItem + (1 | ID), family = binomial, data = longDat)
> summary(fitPos)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: missing ~ posItem + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
52.4 69.7 -23.2 46.4 2367
Scaled residuals:
Min 1Q Median 3Q Max
-0.5111 -0.0012 -0.0012 -0.0010 3.4498
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 197.5 14.05
Number of obs: 2370, groups: ID, 237
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -13.5152 2.5280 -5.346 8.98e-08 ***
posItem -0.2953 1.2396 -0.238 0.812
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
posItem -0.110
> head(longDat,15)
# A tibble: 15 x 8
# Groups: ID [2]
ID Item Score missing missRand missReg riskItem posItem
<chr> <chr> <dbl> <dbl> <int> <int> <dbl> <dbl>
1 ALCA0001 YPCORE01 1 0 0 0 0 0
2 ALCA0001 YPCORE02 0 0 0 0 0 0
3 ALCA0001 YPCORE03 0 0 0 0 0 1
4 ALCA0001 YPCORE04 0 0 0 0 1 0
5 ALCA0001 YPCORE05 0 0 0 0 0 1
6 ALCA0001 YPCORE06 1 0 0 0 0 0
7 ALCA0001 YPCORE07 0 0 0 0 0 0
8 ALCA0001 YPCORE08 0 0 0 0 0 0
9 ALCA0001 YPCORE09 0 0 0 0 0 0
10 ALCA0001 YPCORE10 0 0 0 0 0 1
11 ALCA0004 YPCORE01 1 0 0 0 0 0
12 ALCA0004 YPCORE02 0 0 0 0 0 0
13 ALCA0004 YPCORE03 0 0 0 0 0 1
14 ALCA0004 YPCORE04 0 0 0 0 1 0
15 ALCA0004 YPCORE05 0 0 0 0 0 1
>
>
> rm(fitRand)
> rm(fitReg)
> rm(fitReal)
> fitRand <- glmer(cbind(missRand, 1) ~ Item + (1 | ID), family = binomial(link = "logit"), data = longDat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0501701 (tol = 0.002, component 1)
> fitRand
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: cbind(missRand, 1) ~ Item + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
2215.017 2278.494 -1096.509 2193.017 2359
Random effects:
Groups Name Std.Dev.
ID (Intercept) 2.509
Number of obs: 2370, groups: ID, 237
Fixed Effects:
(Intercept) ItemYPCORE02 ItemYPCORE03 ItemYPCORE04 ItemYPCORE05 ItemYPCORE06 ItemYPCORE07 ItemYPCORE08
-2.512456 0.004360 0.007166 0.001288 0.004078 0.002322 0.005059 0.006118
ItemYPCORE09 ItemYPCORE10
0.006454 0.005057
convergence code 0; 0 optimizer warnings; 1 lme4 warnings
So failed to converge but does give a set of estimates. I think I can use the
> fitReg <- glmer(cbind(missReg, 1) ~ Item + (1 | ID), family = "binomial", data = longDat)
Warning messages:
1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
failure to converge in 10000 evaluations
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 1.66664 (tol = 0.002, component 1)
> fitReg
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: cbind(missReg, 1) ~ Item + (1 | ID)
Data: longDat
AIC BIC logLik deviance df.resid
605.3404 668.8175 -291.6702 583.3404 2359
Random effects:
Groups Name Std.Dev.
ID (Intercept) 6.775
Number of obs: 2370, groups: ID, 237
Fixed Effects:
(Intercept) ItemYPCORE02 ItemYPCORE03 ItemYPCORE04 ItemYPCORE05 ItemYPCORE06 ItemYPCORE07 ItemYPCORE08
-8.6607 -0.3793 -0.3159 -0.3591 -0.4608 -0.4453 -0.4165 -0.6597
ItemYPCORE09 ItemYPCORE10
-0.4304 -0.3535
convergence code 0; 1 optimizer warnings; 1 lme4 warnings
I am starting to realise the depths of my ignorance about all this. It's clear to me that having only one observation per cell
because of the nesting of items within participants and only having one completion per participant is causing the convergence
issues (which makes sense though I have a sense that there might be ways to extract estimates despite this: this is beyond me!)
I'm puzzled that I get slightly different results for the risk analysis if I use the "cbind(missing, 1) ~ " syntax in the formula
from those I get using just "missing ~ " and different max|grad values from the nonconvergence message depending on that choice.
I suspect that's a red herring.
I have seen many comments here recently about non-convergence and ways to overcome it by specifying different methods of
approximation (if I understood that properly) but they generally seemed to be for much more complex models than mine and
probably weren't about this "cell size = 1" issue. I have searched for answer or illumination for some hours and perhaps
I'm asking the wrong questions but I'm stuck. (I think it's very unlikely that I have access to trained statisticians by
virtue of my honorary or paid academic positions!)
So my questions:
1) _is_ there a way to estimate such a model, i.e. estimating an effect across all ten items with only one completion per participant?
2) can someone suggest good reading matter for a dogged non-statistician who can handle a lot of continuous variable issues up to
some real complexity on his own but has never been as comfortable with counts beyond the basic old NHST, single level approaches? I
have an old copy of Pinheiro and Bates but confess that despite trying many times, the older it and I get, the less I'm able to cope
with it. The algebra is just beyond me.
TIA ... and best wishes to all for physical safety and psychological resilience in these times,
Chris
--
Small contribution in our coronavirus rigours:
https://www.coresystemtrust.org.uk/home/free-options-to-replace-paper-core-forms-during-the-coronavirus-pandemic/
Chris Evans <chris using psyctc.org> Visiting Professor, University of Sheffield <chris.evans using sheffield.ac.uk>
I do some consultation work for the University of Roehampton <chris.evans using roehampton.ac.uk> and other places
but <chris using psyctc.org> remains my main Email address. I have a work web site at:
https://www.psyctc.org/psyctc/
and a site I manage for CORE and CORE system trust at:
http://www.coresystemtrust.org.uk/
I have "semigrated" to France, see:
https://www.psyctc.org/pelerinage2016/semigrating-to-france/
https://www.psyctc.org/pelerinage2016/register-to-get-updates-from-pelerinage2016/
If you want an Emeeting, I am trying to keep them to Thursdays and my diary is at:
https://www.psyctc.org/pelerinage2016/ceworkdiary/
Beware: French time, generally an hour ahead of UK.
More information about the R-sig-mixed-models
mailing list