[R-sig-ME] Struggling (probably problem in wetware not software) with multilevel logistic regression

Juho Kristian Ruohonen juho@kr|@t|@n@ruohonen @end|ng |rom gm@||@com
Sat May 2 14:04:02 CEST 2020


(reposting my response since the first version failed to include the
mailing list among the recipients)

Here’s a non-statistician’s take:

You don’t need a multilevel model for this. Given that every respondent
contributes only one data point, no one’s idiosyncracies are
overrepresented, and we can expect them to cancel each other out. Your
question can be modeled using standard logistic regression.

Since you suspect *posItem* may influence the chance of omission, you
should include it as an additional fixed effect besides *negItem*.
Something like:

mymodel <- glm(missing ~ posItem + riskItem, family = binomial, data =
mydata)

It’ll be interesting to hear whether real statisticians disagree.

Best,

J

pe 1. toukok. 2020 klo 21.38 Chris Evans (chrishold using psyctc.org) kirjoitti:

> I am sorry if this is embarrassingly stupid/easy but I am failing to
> understand something I thought would be fairly easy.
>
> The question of interest to me and others is whether people omit some
> items of a questionnaire more than others.
> To give a simple example I have several datasets for a ten item
> self-report questionnaire where one item is about risk
> and where three items are cued positively and the other seven are cued
> negatively.
>
> Here is an example of some such data:
>
>    ID       Item     Score missing missRand missReg riskItem posItem
>    <chr>    <chr>    <dbl>   <dbl>    <int>   <int>    <dbl>   <dbl>
>  1 ALCA0001 YPCORE01     1       0        0       0        0       0
>  2 ALCA0001 YPCORE02     0       0        0       0        0       0
>  3 ALCA0001 YPCORE03     0       0        0       0        0       1
>  4 ALCA0001 YPCORE04     0       0        0       0        1       0
>  5 ALCA0001 YPCORE05     0       0        0       0        0       1
>  6 ALCA0001 YPCORE06     1       0        0       0        0       0
>  7 ALCA0001 YPCORE07     0       0        0       0        0       0
>  8 ALCA0001 YPCORE08     0       0        0       0        0       0
>  9 ALCA0001 YPCORE09     0       0        0       0        0       0
> 10 ALCA0001 YPCORE10     0       0        0       0        0       1
> 11 ALCA0004 YPCORE01     1       0        0       0        0       0
> 12 ALCA0004 YPCORE02     0       0        0       0        0       0
> 13 ALCA0004 YPCORE03     0       0        0       0        0       1
> 14 ALCA0004 YPCORE04     0       0        0       0        1       0
> 15 ALCA0004 YPCORE05     0       0        0       0        0       1
>
> missing is the observed omission of the item (1 = omitted)
> missRand is a binomial random with probability = .5 I created
> missReg is a biomial random with probability = item index number / 11,
> i.e. strong relationship with item
> riskItem identifies the risk items (item 4)
> posItem identifies the positively cued items (3, 5 and 10)
>
> For this particular dataset there are 237 participants.  (Like Groucho
> Marks and principles, I have
> others if you don't like this one!)
>
> The catch is that each participant has only completed the measure once.
> It is a plausible model that
> participants will vary in their propensity to omit items, it is also
> highly plausible that participants
> might omit the risk item more than the other item. It's a question of
> interest (though I think not of
> huge interest nor would I bet on it (!) that the positively cued items
> might be omitted more than the
> negatively cued ones or v.v.
>
> I thought I could test for these effects using glmer().  That works for
>
> > fitRisk <- glmer(missing ~ riskItem + (1 | ID), family = binomial, data
> = longDat)
> > summary(fitRisk)
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: missing ~ riskItem + (1 | ID)
>    Data: longDat
>
>      AIC      BIC   logLik deviance df.resid
>     51.7     69.0    -22.9     45.7     2367
>
> Scaled residuals:
>      Min       1Q   Median       3Q      Max
> -0.52020 -0.00113 -0.00113 -0.00113  2.93829
>
> Random effects:
>  Groups Name        Variance Std.Dev.
>  ID     (Intercept) 205.3    14.33
> Number of obs: 2370, groups:  ID, 237
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -13.563      2.540  -5.339 9.33e-08 ***
> riskItem      -2.418      3.745  -0.646    0.518
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>          (Intr)
> riskItem 0.064
>
> No effect but small dataset and test of principle works.  However, when I
> come to test the overall difference by items I hit
>
> 5       NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
>    NA     NA     NA     NA     NA     NA
>   BITE23 BITE24 BITE25 BITE26 BITE27 BITE28 BITE29 BITE30 BITE31 BITE32
> BITE33 Notas origRow Centro2 transfer one
> 1     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
>  NA  <NA>      43  Urgell        0   1
> 2     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
>  NA  <NA>      68  Urgell        0   1
> 3     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
>  NA  <NA>      69  Urgell        0   1
> 4     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
>  NA  <NA>      70  Urgell        0   1
> 5     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
>  NA  <NA>      71  Urgell        0   1
>   Exclude quaireN ID.quaireN Nquaires  ID.episodeN episodeN nEpisodes
> Fecha.inicio  Fecha.fin ExpCOREOM ExpSFA ExpSFB
> 1       0       5 URGE0003-5       27 URGE0003-001        1         1
>  2017-11-13 2019-02-12        NA     NA     NA
> 2       0       3 URGE0004-3       36 URGE0004-001        1         2
>  2017-11-21 2018-09-10        NA     NA     NA
> 3       0       4 URGE0004-4       36 URGE0004-001        1         2
>  2017-11-21 2018-09-10        NA     NA     NA
> 4       0       5 URGE0004-5       36 URGE0004-001        1         2
>  2017-11-21 2018-09-10        NA     NA     NA
> 5       0       6 URGE0004-6       36 URGE0004-001        1         2
>  2017-11-21 2018-09-10        NA     NA     NA
>   ExpYPCORE ExpEAT ExpBITE missingCOREOM missingSFA missingSFB missingYP
> missingEAT missingBITE impossCOREOM impossSFA
> 1        NA     NA      NA            NA         NA         NA        NA
>        NA          NA           NA        NA
> 2        NA     NA      NA            NA         NA         NA        NA
>        NA          NA           NA        NA
> 3        NA     NA      NA            NA         NA         NA        NA
>        NA          NA           NA        NA
> 4        NA     NA      NA            NA         NA         NA        NA
>        NA          NA           NA        NA
> 5        NA     NA      NA            NA         NA         NA        NA
>        NA          NA           NA        NA
>   impossSFB impossYP impossEAT impossBITE
> 1        NA       NA        NA         NA
> 2        NA       NA        NA         NA
> 3        NA       NA        NA         NA
> 4        NA       NA        NA         NA
> 5        NA       NA        NA         NA
>  [ reached 'max' / getOption("max.print") -- omitted 672 rows ]
> > datPrincip_URGE %>%
> +   arrange(Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   dim()
> [1] 677 195
> > datPrincip_URGE %>%
> +   arrange(Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +     dim()
> [1]  61 197
> > dat <- bind_rows(datPrincip_ALCA,
> +                  datPrincip_URGE)
> > dat %>%
> +   arrange(Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +     dim()
> [1] 118 197
> > dat <- bind_rows(datPrincip_ALCA,
> +                  datPrincip_ARGE,
> +                  datPrincip_AVEN,
> +                  datPrincip_MOSC,
> +                  datPrincip_SABA,
> +                  datPrincip_SEVI,
> +                  datPrincip_TARR,
> +                  datPrincip_URGE)
> > dat %>%
> +   arrange(Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +     dim()
> [1] 237 197
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +     dim()
> [1] 237 197
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   dim()
> [1] 237  11
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   pivot_longer(cols = starts_with("YPCORE")) %>%
> +   dim()
> [1] 2370    3
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   pivot_longer(cols = starts_with("YPCORE")) %>%
> +   mutate(missing = if_else(!is.na(value), 0, 1)) -> longDat
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       name     value missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > table(longDat$missing)
>
>    0    1
> 2356   14
> > ?pivot_longer
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   pivot_longer(cols = starts_with("YPCORE"),
> +                names_to = "Item") %>%
> +   mutate(missing = if_else(!is.na(value), 0, 1)) -> longDat
> > head(longD)
> Error in head(longD) : object 'longD' not found
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       Item     value missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   pivot_longer(cols = starts_with("YPCORE"),
> +                names_to = "Item",
> +                values_to = "Score") %>%
> +   mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > formul1 <- missing ~ Item + Item | ID
> > fit <- glmer(formul1, family = binomial, data = longDat)
> Error in glmer(formul1, family = binomial, data = longDat) :
>   could not find function "glmer"
> > library(lme4)
> Loading required package: Matrix
>
> Attaching package: ‘Matrix’
>
> The following objects are masked from ‘package:tidyr’:
>
>     expand, pack, unpack
>
> > formul1 <- missing ~ Item + Item | ID
> > fit <- glmer(formul1, family = binomial, data = longDat)
> boundary (singular) fit: see ?isSingular
> Warning messages:
> 1: In commonArgs(par, fn, control, environment()) :
>   maxfun < 10 * length(par)^2 is not recommended.
> 2: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,
> :
>   failure to converge in 10000 evaluations
> > formul1 <- missing ~ Item | ID
> > fit <- glmer(formul1, family = binomial, data = longDat)
> boundary (singular) fit: see ?isSingular
> Warning messages:
> 1: In commonArgs(par, fn, control, environment()) :
>   maxfun < 10 * length(par)^2 is not recommended.
> 2: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,
> :
>   failure to converge in 10000 evaluations
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > ?glmer
> > ?mcnemar.test
> > table(is.na(dat$YPCORE01), is.na(dat$YPCORE02))
>
>         FALSE TRUE
>   FALSE  3573    4
>   TRUE      1 9499
> > mcnemar.test(table(is.na(dat$YPCORE01), is.na(dat$YPCORE02)))
>
>         McNemar's Chi-squared test with continuity correction
>
> data:  table(is.na(dat$YPCORE01), is.na(dat$YPCORE02))
> McNemar's chi-squared = 0.8, df = 1, p-value = 0.3711
>
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   pivot_longer(cols = c("YPCORE01", "YPCORE02")),
> Error: unexpected ',' in:
> "  select(ID, starts_with("YP")) %>%
>   pivot_longer(cols = c("YPCORE01", "YPCORE02")),"
> >                names_to = "Item",
> Error: unexpected ',' in "               names_to = "Item","
> >                values_to = "Score") %>%
> Error: unexpected ')' in "               values_to = "Score")"
> >   mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> Error in if_else(!is.na(Score), 0, 1) : object 'Score' not found
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YP")) %>%
> +   pivot_longer(cols = c("YPCORE01", "YPCORE02"),
> +                names_to = "Item",
> +                values_to = "Score") %>%
> +   mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> > head(longDat)
> # A tibble: 6 x 12
> # Groups:   ID [3]
>   ID       YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08 YPCORE09
> YPCORE10 Item     Score missing
>   <chr>       <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
>   <dbl> <chr>    <dbl>   <dbl>
> 1 ALCA0001        0        0        0        1        0        0        0
>       0 YPCORE01     1       0
> 2 ALCA0001        0        0        0        1        0        0        0
>       0 YPCORE02     0       0
> 3 ALCA0004        0        0        0        1        0        0        0
>       0 YPCORE01     1       0
> 4 ALCA0004        0        0        0        1        0        0        0
>       0 YPCORE02     0       0
> 5 ALCA0007        1        0        0        1        0        2        1
>       0 YPCORE01     2       0
> 6 ALCA0007        1        0        0        1        0        2        1
>       0 YPCORE02     1       0
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, c("YPCORE01", "YPCORE02")) %>%
> +   pivot_longer(cols = c("YPCORE01", "YPCORE02"),
> +                names_to = "Item",
> +                values_to = "Score") %>%
> +   mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [3]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0004 YPCORE01     1       0
> 4 ALCA0004 YPCORE02     0       0
> 5 ALCA0007 YPCORE01     2       0
> 6 ALCA0007 YPCORE02     1       0
> > fit <- glmer(formul1, family = binomial, data = longDat)
> boundary (singular) fit: see ?isSingular
> > ?isSingular
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [3]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0004 YPCORE01     1       0
> 4 ALCA0004 YPCORE02     0       0
> 5 ALCA0007 YPCORE01     2       0
> 6 ALCA0007 YPCORE02     1       0
> > table(longDat$missing)
>
>   0   1
> 472   2
> > library(lme4)
> > formul1 <- missing ~ (Item | ID)
> > fit <- glmer(formul1, family = binomial, data = longDat)
> boundary (singular) fit: see ?isSingular
> > longDat$missRand <- rbinom(1, length(longDat), .5)
> > head(longDat)
> # A tibble: 6 x 5
> # Groups:   ID [3]
>   ID       Item     Score missing missRand
>   <chr>    <chr>    <dbl>   <dbl>    <int>
> 1 ALCA0001 YPCORE01     1       0        2
> 2 ALCA0001 YPCORE02     0       0        2
> 3 ALCA0004 YPCORE01     1       0        2
> 4 ALCA0004 YPCORE02     0       0        2
> 5 ALCA0007 YPCORE01     2       0        2
> 6 ALCA0007 YPCORE02     1       0        2
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > ?rbinom
> > rbinom(1,1,.05)
> [1] 0
> > rbinom(1,1,.05)
> [1] 0
> > rbinom(1,1,.95)
> [1] 1
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.95)
> [1] 1
> > rbinom(1,1,.5)
> [1] 1
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 1
> > rbinom(1,1,.5)
> [1] 0
> > rbinom(1,1,.5)
> [1] 1
> > rbinom(1,2,.5)
> [1] 2
> > rbinom(2,1,.5)
> [1] 0 0
> > rbinom(2,1,.5)
> [1] 0 0
> > rbinom(2,1,.5)
> [1] 0 0
> > rbinom(2,1,.5)
> [1] 0 1
> > rbinom(2,1,.5)
> [1] 1 1
> > rbinom(2,1,.5)
> [1] 0 1
> > longDat$missRand <- rbinom(length(longDat), 1, .5)
> Error: Assigned data `rbinom(length(longDat), 1, 0.5)` must be compatible
> with existing data.
> x Existing data has 474 rows.
> x Assigned data has 5 rows.
> ℹ Only vectors of size 1 are recycled.
> Run `rlang::last_error()` to see where the error occurred.
> > table(longDat$missing)
>
>   0   1
> 472   2
> > library(lme4)
> > head(longDat)
> # A tibble: 6 x 5
> # Groups:   ID [3]
>   ID       Item     Score missing missRand
>   <chr>    <chr>    <dbl>   <dbl>    <int>
> 1 ALCA0001 YPCORE01     1       0        2
> 2 ALCA0001 YPCORE02     0       0        2
> 3 ALCA0004 YPCORE01     1       0        2
> 4 ALCA0004 YPCORE02     0       0        2
> 5 ALCA0007 YPCORE01     2       0        2
> 6 ALCA0007 YPCORE02     1       0        2
> > rbinom(2,1,.5)
> [1] 1 0
> > rbinom(2,1,.5)
> [1] 1 1
> > rbinom(2,1,.5)
> [1] 1 1
> > longDat$missRand <- NULL
> > longDat$missRand <- rbinom(length(longDat), 1, .5)
> Error: Assigned data `rbinom(length(longDat), 1, 0.5)` must be compatible
> with existing data.
> x Existing data has 474 rows.
> x Assigned data has 4 rows.
> ℹ Only vectors of size 1 are recycled.
> Run `rlang::last_error()` to see where the error occurred.
> > longDat$missRand <- rbinom(nrow(longDat), 1, .5)
> > table(longDat$missRand)
>
>   0   1
> 240 234
> > head(longDat)
> # A tibble: 6 x 5
> # Groups:   ID [3]
>   ID       Item     Score missing missRand
>   <chr>    <chr>    <dbl>   <dbl>    <int>
> 1 ALCA0001 YPCORE01     1       0        0
> 2 ALCA0001 YPCORE02     0       0        1
> 3 ALCA0004 YPCORE01     1       0        0
> 4 ALCA0004 YPCORE02     0       0        1
> 5 ALCA0007 YPCORE01     2       0        0
> 6 ALCA0007 YPCORE02     1       0        0
> > formul1 <- missRand ~ (Item | ID)
> > fit <- glmer(formul1, family = binomial, data = longDat)
> boundary (singular) fit: see ?isSingular
> > fit <- glmer(missRand ~ Item, family = binomial, data = longDat)
> Error: No random effects terms specified in formula
> > fit <- glmer(missRand ~ Item | ID, family = binomial, data = longDat)
> boundary (singular) fit: see ?isSingular
> > fit <- glmer(factor(missRand) ~ Item | ID, family = binomial, data =
> longDat)
> boundary (singular) fit: see ?isSingular
> > fit <- glmer(missRand ~ Item | ID, family = "logistic", data = longDat)
> Error in get(family, mode = "function", envir = parent.frame(2)) :
>   object 'logistic' of mode 'function' was not found
> > fit <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data =
> longDat)
> boundary (singular) fit: see ?isSingular
> > ?glmer
> > longDat
> # A tibble: 474 x 5
> # Groups:   ID [237]
>    ID       Item     Score missing missRand
>    <chr>    <chr>    <dbl>   <dbl>    <int>
>  1 ALCA0001 YPCORE01     1       0        0
>  2 ALCA0001 YPCORE02     0       0        1
>  3 ALCA0004 YPCORE01     1       0        0
>  4 ALCA0004 YPCORE02     0       0        1
>  5 ALCA0007 YPCORE01     2       0        0
>  6 ALCA0007 YPCORE02     1       0        0
>  7 ALCA0012 YPCORE01     3       0        1
>  8 ALCA0012 YPCORE02     1       0        1
>  9 ALCA0013 YPCORE01     4       0        1
> 10 ALCA0013 YPCORE02     3       0        1
> # … with 464 more rows
> > table(longDat$missRand, longDat$Item)
>
>     YPCORE01 YPCORE02
>   0      125      115
>   1      112      122
> >
>
> > fit <- glmer(missRand ~ factor(Item) + (1 | ID), family = "binomial",
> data = longDat)
> boundary (singular) fit: see ?isSingular
> > fit <- glmer(missRand ~ factor(Item) + (1 | factor(ID)), family =
> "binomial", data = longDat)
> Error: couldn't evaluate grouping factor factor(ID) within model frame:
> try adding grouping factor to data frame explicitly if possible
> > fit <- glmer(missRand ~ factor(Item) + (1 | ID), family = "binomial",
> data = longDat)
> boundary (singular) fit: see ?isSingular
> > rm(fit)
> > fit <- glmer(missRand ~ factor(Item) + (1 | ID), family = "binomial",
> data = longDat)
> boundary (singular) fit: see ?isSingular
> > fit
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: missRand ~ factor(Item) + (1 | ID)
>    Data: longDat
>       AIC       BIC    logLik  deviance  df.resid
>  662.1833  674.6669 -328.0917  656.1833       471
> Random effects:
>  Groups Name        Std.Dev.
>  ID     (Intercept) 1.943e-07
> Number of obs: 474, groups:  ID, 237
> Fixed Effects:
>          (Intercept)  factor(Item)YPCORE02
>              -0.1098                0.1689
> convergence code 0; 0 optimizer warnings; 1 lme4 warnings
> > dat %>%
> +   arrange(ID, Fecha.evaluacion) %>% # sort by evaluation date
> +    filter(Cuestionarios == "YP-CORE" &
> +            Codigo.datos == "RESPONDIO CORRECTAMENTE") %>%
> +   group_by(ID) %>%
> +   mutate(occasion = row_number(),
> +          first = if_else(occasion == 1, 1, 0)) %>% # to get first
> ocassion only
> +   filter(first == 1) %>%
> +   select(ID, starts_with("YPCORE")) %>%
> +   pivot_longer(cols = starts_with("YPCORE"),
> +                names_to = "Item",
> +                values_to = "Score") %>%
> +   mutate(missing = if_else(!is.na(Score), 0, 1)) -> longDat
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > formul1 <- missing ~ (Item | ID)
> > fit <- glmer(missing ~ factor(Item) + (1 | ID), family = "binomial",
> data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 1.78883 (tol = 0.002,
> component 1)
> > fit <- glmer((score == 1) ~ factor(Item) + (1 | ID), family =
> "binomial", data = longDat)
> Error in eval(predvars, data, env) : object 'score' not found
> > fit <- glmer((Score == 1) ~ factor(Item) + (1 | ID), family =
> "binomial", data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.00309555 (tol = 0.002,
> component 1)
> > fit <- glmer((Score == 1) ~ factor(Item) + (factor(Item) | ID), family =
> "binomial", data = longDat)
> Error: number of observations (=2356) < number of random effects (=2360)
> for term (factor(Item) | ID); the random-effects parameters are probably
> unidentifiable
> > fit <- glmer((Score == 1) ~ factor(Item) + (Item | ID), family =
> "binomial", data = longDat)
> Error: number of observations (=2356) < number of random effects (=2360)
> for term (Item | ID); the random-effects parameters are probably
> unidentifiable
> > fit <- glmer((Score == 1) ~ factor(Item) + (1 | ID), family =
> "binomial", data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.00309555 (tol = 0.002,
> component 1)
> > head(longDat)
> # A tibble: 6 x 4
> # Groups:   ID [1]
>   ID       Item     Score missing
>   <chr>    <chr>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0
> 2 ALCA0001 YPCORE02     0       0
> 3 ALCA0001 YPCORE03     0       0
> 4 ALCA0001 YPCORE04     0       0
> 5 ALCA0001 YPCORE05     0       0
> 6 ALCA0001 YPCORE06     1       0
> > longDat$missRand <- rbinom(nrow(longDat), 1, .5)
> > table(longDat$missRand)
>
>    0    1
> 1211 1159
> > fit <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data =
> longDat)
> boundary (singular) fit: see ?isSingular
> > fit
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: missRand ~ Item + (1 | ID)
>    Data: longDat
>       AIC       BIC    logLik  deviance  df.resid
>  3300.819  3364.296 -1639.410  3278.819      2359
> Random effects:
>  Groups Name        Std.Dev.
>  ID     (Intercept) 2.663e-07
> Number of obs: 2370, groups:  ID, 237
> Fixed Effects:
>  (Intercept)  ItemYPCORE02  ItemYPCORE03  ItemYPCORE04  ItemYPCORE05
> ItemYPCORE06  ItemYPCORE07  ItemYPCORE08
>   -5.909e-02     1.013e-01    -1.690e-02     1.182e-01     6.753e-02
> -1.186e-01    -1.046e-15    -1.186e-01
> ItemYPCORE09  ItemYPCORE10
>    1.858e-01    -6.766e-02
> convergence code 0; 0 optimizer warnings; 1 lme4 warnings
> > anova(fit)
> Analysis of Variance Table
>      npar Sum Sq Mean Sq F value
> Item    9 5.5453 0.61615  0.6161
> > table(longDat$Item, as.numeric(longDat$Item))
> < table of extent 10 x 0 >
> Warning message:
> In table(longDat$Item, as.numeric(longDat$Item)) :
>   NAs introduced by coercion
> > dim(longDat)
> [1] 2370    5
> > head(longDat)
> # A tibble: 6 x 5
> # Groups:   ID [1]
>   ID       Item     Score missing missRand
>   <chr>    <chr>    <dbl>   <dbl>    <int>
> 1 ALCA0001 YPCORE01     1       0        1
> 2 ALCA0001 YPCORE02     0       0        1
> 3 ALCA0001 YPCORE03     0       0        1
> 4 ALCA0001 YPCORE04     0       0        1
> 5 ALCA0001 YPCORE05     0       0        1
> 6 ALCA0001 YPCORE06     1       0        0
> > table(longDat$Item) #, as.numeric(longDat$Item))
>
> YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07 YPCORE08
> YPCORE09 YPCORE10
>      237      237      237      237      237      237      237      237
>   237      237
> > table(as.numeric(longDat$Item)) #, as.numeric(longDat$Item))
> < table of extent 0 >
> Warning message:
> In table(as.numeric(longDat$Item)) : NAs introduced by coercion
> > table(longDat$Item, as.numeric(factor(longDat$Item)))
>
>              1   2   3   4   5   6   7   8   9  10
>   YPCORE01 237   0   0   0   0   0   0   0   0   0
>   YPCORE02   0 237   0   0   0   0   0   0   0   0
>   YPCORE03   0   0 237   0   0   0   0   0   0   0
>   YPCORE04   0   0   0 237   0   0   0   0   0   0
>   YPCORE05   0   0   0   0 237   0   0   0   0   0
>   YPCORE06   0   0   0   0   0 237   0   0   0   0
>   YPCORE07   0   0   0   0   0   0 237   0   0   0
>   YPCORE08   0   0   0   0   0   0   0 237   0   0
>   YPCORE09   0   0   0   0   0   0   0   0 237   0
>   YPCORE10   0   0   0   0   0   0   0   0   0 237
> > longDat %>%
> +   mutate(missRand = rbinom(1, 1, .5))
> # A tibble: 2,370 x 5
> # Groups:   ID [237]
>    ID       Item     Score missing missRand
>    <chr>    <chr>    <dbl>   <dbl>    <int>
>  1 ALCA0001 YPCORE01     1       0        1
>  2 ALCA0001 YPCORE02     0       0        1
>  3 ALCA0001 YPCORE03     0       0        1
>  4 ALCA0001 YPCORE04     0       0        1
>  5 ALCA0001 YPCORE05     0       0        1
>  6 ALCA0001 YPCORE06     1       0        1
>  7 ALCA0001 YPCORE07     0       0        1
>  8 ALCA0001 YPCORE08     0       0        1
>  9 ALCA0001 YPCORE09     0       0        1
> 10 ALCA0001 YPCORE10     0       0        1
> # … with 2,360 more rows
> > table(longDat$missRand)
>
>    0    1
> 1211 1159
> > longDat %>%
> +   mutate(missRand = rbinom(1, 1, .5),
> +          missReg = rbinom(1, 1, as.numeric(factor(Item))/11))
> # A tibble: 2,370 x 6
> # Groups:   ID [237]
>    ID       Item     Score missing missRand missReg
>    <chr>    <chr>    <dbl>   <dbl>    <int>   <int>
>  1 ALCA0001 YPCORE01     1       0        1       0
>  2 ALCA0001 YPCORE02     0       0        1       0
>  3 ALCA0001 YPCORE03     0       0        1       0
>  4 ALCA0001 YPCORE04     0       0        1       0
>  5 ALCA0001 YPCORE05     0       0        1       0
>  6 ALCA0001 YPCORE06     1       0        1       0
>  7 ALCA0001 YPCORE07     0       0        1       0
>  8 ALCA0001 YPCORE08     0       0        1       0
>  9 ALCA0001 YPCORE09     0       0        1       0
> 10 ALCA0001 YPCORE10     0       0        1       0
> # … with 2,360 more rows
> > table(longDat$missReg)
> < table of extent 0 >
> Warning message:
> Unknown or uninitialised column: `missReg`.
> > longDat %>%
> +   mutate(missRand = rbinom(1, 1, .5),
> +          missReg = rbinom(1, 1, as.numeric(factor(Item))/11)) -> longDat
> > table(longDat$missReg)
>
>    0    1
> 2120  250
> > fit <- glmer(missReg ~ Item + (1 | ID), family = "binomial", data =
> longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 3.81902 (tol = 0.002,
> component 1)
> > longDat %>%
> +   mutate(missRand = rbinom(1, 1, .5),
> +          missReg = rbinom(1, 1, as.numeric(factor(Item))/11)) -> longDat
> > head(longDat)
> # A tibble: 6 x 6
> # Groups:   ID [1]
>   ID       Item     Score missing missRand missReg
>   <chr>    <chr>    <dbl>   <dbl>    <int>   <int>
> 1 ALCA0001 YPCORE01     1       0        1       1
> 2 ALCA0001 YPCORE02     0       0        1       1
> 3 ALCA0001 YPCORE03     0       0        1       1
> 4 ALCA0001 YPCORE04     0       0        1       1
> 5 ALCA0001 YPCORE05     0       0        1       1
> 6 ALCA0001 YPCORE06     1       0        1       1
> > table(longDat$missRand)
>
>    0    1
> 1140 1230
> > table(longDat$missReg)
>
>    0    1
> 2160  210
> > library(lme4)
> > fitRand <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data =
> longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 5.46528 (tol = 0.002,
> component 1)
> > anova(fitRand)
> Analysis of Variance Table
>      npar   Sum Sq   Mean Sq F value
> Item    9 0.050372 0.0055969  0.0056
> > length(unique(longDat$ID))
> [1] 237
> > rm(fitRand)
> > anova(fitRand)
> Error in anova(fitRand) : object 'fitRand' not found
> > fitRand <- glmer(missRand ~ Item + (1 | ID), family = "binomial", data =
> longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 5.46528 (tol = 0.002,
> component 1)
> > fitRand <- glmer(missRand ~ Item + (1 | ID), family = binomial, data =
> longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 5.46528 (tol = 0.002,
> component 1)
> > fitRand <- glmer(missRand ~ Item + (1 | ID), family = binomial(link =
> "logit"), data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 5.46528 (tol = 0.002,
> component 1)
> > ?glmer
> > data("cbpp")
> > dim(cbpp)
> [1] 56  4
> > head(cbpp)
>   herd incidence size period
> 1    1         2   14      1
> 2    1         3   12      2
> 3    1         4    9      3
> 4    1         0    5      4
> 5    2         3   22      1
> 6    2         1   18      2
> > fitRand <- glmer(cbind(missRand, 1) ~ Item + (1 | ID), family =
> binomial(link = "logit"), data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.00550177 (tol = 0.002,
> component 1)
> > fitReg <- glmer(cbind(missReg, 1) ~ Item + (1 | ID), family =
> "binomial", data = longDat)
> Warning messages:
> 1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,
> :
>   failure to converge in 10000 evaluations
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 1.47499 (tol = 0.002,
> component 1)
> > fitReal <- glmer(cbind(missing, 1) ~ Item + (1 | ID), family =
> "binomial", data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 1.29013 (tol = 0.002,
> component 1)
> > data(mlmdata)
> Warning message:
> In data(mlmdata) : data set ‘mlmdata’ not found
> > library(haven)
> > mlmdata <- read_dta("
> https://stats.idre.ucla.edu/stat/examples/imm/imm10.dta")
> > head(mlmdata)
> # A tibble: 6 x 19
>   schid stuid    ses meanses homework white parented public ratio percmin
> math   sex  race sctype  cstr scsize urban
>   <dbl> <dbl>  <dbl>   <dbl>    <dbl> <dbl>    <dbl>  <dbl> <dbl>   <dbl>
> <dbl> <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>
> 1  7472     3 -0.130  -0.483        1     1        2      1    19       0
>   48     2     4      1     2      3     2
> 2  7472     8 -0.390  -0.483        0     1        2      1    19       0
>   48     1     4      1     2      3     2
> 3  7472    13 -0.800  -0.483        0     1        2      1    19       0
>   53     1     4      1     2      3     2
> 4  7472    17 -0.720  -0.483        1     1        2      1    19       0
>   42     1     4      1     2      3     2
> 5  7472    27 -0.740  -0.483        2     1        2      1    19       0
>   43     2     4      1     2      3     2
> 6  7472    28 -0.580  -0.483        1     1        2      1    19       0
>   57     2     4      1     2      3     2
> # … with 2 more variables: region <dbl>, schnum <dbl>
> > model <- glmer(white ~ homework + (1 + homework | schid), data=mlmdata,
> +                family=binomial(link="logit"))
> boundary (singular) fit: see ?isSingular
> > summary(model)
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: white ~ homework + (1 + homework | schid)
>    Data: mlmdata
>
>      AIC      BIC   logLik deviance df.resid
>    182.4    200.2    -86.2    172.4      255
>
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -4.3373 -0.1184  0.1112  0.3421  3.8801
>
> Random effects:
>  Groups Name        Variance Std.Dev. Corr
>  schid  (Intercept) 16.28745 4.0358
>         homework     0.04678 0.2163   -1.00
> Number of obs: 260, groups:  schid, 10
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  1.67365    1.47324   1.136    0.256
> homework     0.04508    0.18433   0.245    0.807
>
> Correlation of Fixed Effects:
>          (Intr)
> homework -0.590
> convergence code: 0
> boundary (singular) fit: see ?isSingular
>
> > dim(mlmdata)
> [1] 260  19
> > head(mlmdata)
> # A tibble: 6 x 19
>   schid stuid    ses meanses homework white parented public ratio percmin
> math   sex  race sctype  cstr scsize urban
>   <dbl> <dbl>  <dbl>   <dbl>    <dbl> <dbl>    <dbl>  <dbl> <dbl>   <dbl>
> <dbl> <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>
> 1  7472     3 -0.130  -0.483        1     1        2      1    19       0
>   48     2     4      1     2      3     2
> 2  7472     8 -0.390  -0.483        0     1        2      1    19       0
>   48     1     4      1     2      3     2
> 3  7472    13 -0.800  -0.483        0     1        2      1    19       0
>   53     1     4      1     2      3     2
> 4  7472    17 -0.720  -0.483        1     1        2      1    19       0
>   42     1     4      1     2      3     2
> 5  7472    27 -0.740  -0.483        2     1        2      1    19       0
>   43     2     4      1     2      3     2
> 6  7472    28 -0.580  -0.483        1     1        2      1    19       0
>   57     2     4      1     2      3     2
> # … with 2 more variables: region <dbl>, schnum <dbl>
> > length(unique(mlmdata$schid))
> [1] 10
> > table(white)
> Error in table(white) : object 'white' not found
> > table(mlmdata$white)
>
>   0   1
>  71 189
> > fitRand <- glmer(missRand ~ Item + (1 | ID), family = binomial(link =
> "logit"), data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 5.46528 (tol = 0.002,
> component 1)
> > ,
> Error: unexpected ',' in ","
> > longDat %>%
> +   mutate(missRand = rbinom(1, 1, .5),
> +          missReg = rbinom(1, 1, as.numeric(factor(Item))/11),
> +          riskItem = if_else(Item == "YPCORE04", 1, 0),
> +          posItem = if_else(Item %in% c("YPCORE03", "YPCORE5",
> "YPCORE10"), 1, 0)) -> longDat
> > head(longDat)
> # A tibble: 6 x 8
> # Groups:   ID [1]
>   ID       Item     Score missing missRand missReg riskItem posItem
>   <chr>    <chr>    <dbl>   <dbl>    <int>   <int>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0        0       0        0       0
> 2 ALCA0001 YPCORE02     0       0        0       0        0       0
> 3 ALCA0001 YPCORE03     0       0        0       0        0       1
> 4 ALCA0001 YPCORE04     0       0        0       0        1       0
> 5 ALCA0001 YPCORE05     0       0        0       0        0       0
> 6 ALCA0001 YPCORE06     1       0        0       0        0       0
> > table(longDat$riskItem, longDat$posItem)
>
>        0    1
>   0 1659  474
>   1  237    0
> > table(longDat$riskItem, longDat$Item)
>
>     YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07
> YPCORE08 YPCORE09 YPCORE10
>   0      237      237      237        0      237      237      237
> 237      237      237
>   1        0        0        0      237        0        0        0
> 0        0        0
> > table(longDat$posItem, longDat$Item)
>
>     YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07
> YPCORE08 YPCORE09 YPCORE10
>   0      237      237        0      237      237      237      237
> 237      237        0
>   1        0        0      237        0        0        0        0
> 0        0      237
> > longDat %>%
> +   mutate(missRand = rbinom(1, 1, .5),
> +          missReg = rbinom(1, 1, as.numeric(factor(Item))/11),
> +          riskItem = if_else(Item == "YPCORE04", 1, 0),
> +          posItem = if_else(Item %in% c("YPCORE03", "YPCORE05",
> "YPCORE10"), 1, 0)) -> longDat
> > head(longDat)
> # A tibble: 6 x 8
> # Groups:   ID [1]
>   ID       Item     Score missing missRand missReg riskItem posItem
>   <chr>    <chr>    <dbl>   <dbl>    <int>   <int>    <dbl>   <dbl>
> 1 ALCA0001 YPCORE01     1       0        0       0        0       0
> 2 ALCA0001 YPCORE02     0       0        0       0        0       0
> 3 ALCA0001 YPCORE03     0       0        0       0        0       1
> 4 ALCA0001 YPCORE04     0       0        0       0        1       0
> 5 ALCA0001 YPCORE05     0       0        0       0        0       1
> 6 ALCA0001 YPCORE06     1       0        0       0        0       0
> > table(longDat$posItem, longDat$Item)
>
>     YPCORE01 YPCORE02 YPCORE03 YPCORE04 YPCORE05 YPCORE06 YPCORE07
> YPCORE08 YPCORE09 YPCORE10
>   0      237      237        0      237        0      237      237
> 237      237        0
>   1        0        0      237        0      237        0        0
> 0        0      237
> > fitRisk <- glmer(missing ~ riskItem + (1 | ID), family = binomial, data
> = longDat)
> > summary(fitRisk)
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: missing ~ riskItem + (1 | ID)
>    Data: longDat
>
>      AIC      BIC   logLik deviance df.resid
>     51.7     69.0    -22.9     45.7     2367
>
> Scaled residuals:
>      Min       1Q   Median       3Q      Max
> -0.52020 -0.00113 -0.00113 -0.00113  2.93829
>
> Random effects:
>  Groups Name        Variance Std.Dev.
>  ID     (Intercept) 205.3    14.33
> Number of obs: 2370, groups:  ID, 237
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -13.563      2.540  -5.339 9.33e-08 ***
> riskItem      -2.418      3.745  -0.646    0.518
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>          (Intr)
> riskItem 0.064
> > fitRisk <- glmer(missing ~ posItem + (1 | ID), family = binomial, data =
> longDat)
> > fitRisk <- glmer(missing ~ riskItem + (1 | ID), family = binomial, data
> = longDat)
> > summary(fitRisk)
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: missing ~ riskItem + (1 | ID)
>    Data: longDat
>
>      AIC      BIC   logLik deviance df.resid
>     51.7     69.0    -22.9     45.7     2367
>
> Scaled residuals:
>      Min       1Q   Median       3Q      Max
> -0.52020 -0.00113 -0.00113 -0.00113  2.93829
>
> Random effects:
>  Groups Name        Variance Std.Dev.
>  ID     (Intercept) 205.3    14.33
> Number of obs: 2370, groups:  ID, 237
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -13.563      2.540  -5.339 9.33e-08 ***
> riskItem      -2.418      3.745  -0.646    0.518
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>          (Intr)
> riskItem 0.064
> > fitPos <- glmer(missing ~ posItem + (1 | ID), family = binomial, data =
> longDat)
> > summary(fitPos)
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: missing ~ posItem + (1 | ID)
>    Data: longDat
>
>      AIC      BIC   logLik deviance df.resid
>     52.4     69.7    -23.2     46.4     2367
>
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -0.5111 -0.0012 -0.0012 -0.0010  3.4498
>
> Random effects:
>  Groups Name        Variance Std.Dev.
>  ID     (Intercept) 197.5    14.05
> Number of obs: 2370, groups:  ID, 237
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept) -13.5152     2.5280  -5.346 8.98e-08 ***
> posItem      -0.2953     1.2396  -0.238    0.812
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>         (Intr)
> posItem -0.110
> > head(longDat,15)
> # A tibble: 15 x 8
> # Groups:   ID [2]
>    ID       Item     Score missing missRand missReg riskItem posItem
>    <chr>    <chr>    <dbl>   <dbl>    <int>   <int>    <dbl>   <dbl>
>  1 ALCA0001 YPCORE01     1       0        0       0        0       0
>  2 ALCA0001 YPCORE02     0       0        0       0        0       0
>  3 ALCA0001 YPCORE03     0       0        0       0        0       1
>  4 ALCA0001 YPCORE04     0       0        0       0        1       0
>  5 ALCA0001 YPCORE05     0       0        0       0        0       1
>  6 ALCA0001 YPCORE06     1       0        0       0        0       0
>  7 ALCA0001 YPCORE07     0       0        0       0        0       0
>  8 ALCA0001 YPCORE08     0       0        0       0        0       0
>  9 ALCA0001 YPCORE09     0       0        0       0        0       0
> 10 ALCA0001 YPCORE10     0       0        0       0        0       1
> 11 ALCA0004 YPCORE01     1       0        0       0        0       0
> 12 ALCA0004 YPCORE02     0       0        0       0        0       0
> 13 ALCA0004 YPCORE03     0       0        0       0        0       1
> 14 ALCA0004 YPCORE04     0       0        0       0        1       0
> 15 ALCA0004 YPCORE05     0       0        0       0        0       1
> >
> >
> > rm(fitRand)
> > rm(fitReg)
> > rm(fitReal)
> > fitRand <- glmer(cbind(missRand, 1) ~ Item + (1 | ID), family =
> binomial(link = "logit"), data = longDat)
> Warning message:
> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.0501701 (tol = 0.002,
> component 1)
> > fitRand
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: cbind(missRand, 1) ~ Item + (1 | ID)
>    Data: longDat
>       AIC       BIC    logLik  deviance  df.resid
>  2215.017  2278.494 -1096.509  2193.017      2359
> Random effects:
>  Groups Name        Std.Dev.
>  ID     (Intercept) 2.509
> Number of obs: 2370, groups:  ID, 237
> Fixed Effects:
>  (Intercept)  ItemYPCORE02  ItemYPCORE03  ItemYPCORE04  ItemYPCORE05
> ItemYPCORE06  ItemYPCORE07  ItemYPCORE08
>    -2.512456      0.004360      0.007166      0.001288      0.004078
> 0.002322      0.005059      0.006118
> ItemYPCORE09  ItemYPCORE10
>     0.006454      0.005057
> convergence code 0; 0 optimizer warnings; 1 lme4 warnings
>
> So failed to converge but does give a set of estimates.  I think I can use
> the
>
>
> > fitReg <- glmer(cbind(missReg, 1) ~ Item + (1 | ID), family =
> "binomial", data = longDat)
> Warning messages:
> 1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,
> :
>   failure to converge in 10000 evaluations
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 1.66664 (tol = 0.002,
> component 1)
> > fitReg
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
>  Family: binomial  ( logit )
> Formula: cbind(missReg, 1) ~ Item + (1 | ID)
>    Data: longDat
>       AIC       BIC    logLik  deviance  df.resid
>  605.3404  668.8175 -291.6702  583.3404      2359
> Random effects:
>  Groups Name        Std.Dev.
>  ID     (Intercept) 6.775
> Number of obs: 2370, groups:  ID, 237
> Fixed Effects:
>  (Intercept)  ItemYPCORE02  ItemYPCORE03  ItemYPCORE04  ItemYPCORE05
> ItemYPCORE06  ItemYPCORE07  ItemYPCORE08
>      -8.6607       -0.3793       -0.3159       -0.3591       -0.4608
>  -0.4453       -0.4165       -0.6597
> ItemYPCORE09  ItemYPCORE10
>      -0.4304       -0.3535
> convergence code 0; 1 optimizer warnings; 1 lme4 warnings
>
> I am starting to realise the depths of my ignorance about all this.  It's
> clear to me that having only one observation per cell
> because of the nesting of items within participants and only having one
> completion per participant is causing the convergence
> issues (which makes sense though I have a sense that there might be ways
> to extract estimates despite this: this is beyond me!)
>
> I'm puzzled that I get slightly different results for the risk analysis if
> I use the "cbind(missing, 1) ~ " syntax in the formula
> from those I get using just "missing ~ " and different max|grad values
> from the nonconvergence message depending on that choice.
> I suspect that's a red herring.
>
> I have seen many comments here recently about non-convergence and ways to
> overcome it by specifying different methods of
> approximation (if I understood that properly) but they generally seemed to
> be for much more complex models than mine and
> probably weren't about this "cell size = 1" issue.  I have searched for
> answer or illumination for some hours and perhaps
> I'm asking the wrong questions but I'm stuck.  (I think it's very unlikely
> that I have access to trained statisticians by
> virtue of my honorary or paid academic positions!)
>
> So my questions:
> 1) _is_ there a way to estimate such a model, i.e. estimating an effect
> across all ten items with only one completion per participant?
> 2) can someone suggest good reading matter for a dogged non-statistician
> who can handle a lot of continuous variable issues up to
> some real complexity on his own but has never been as comfortable with
> counts beyond the basic old NHST, single level approaches?  I
> have an old copy of Pinheiro and Bates but confess that despite trying
> many times, the older it and I get, the less I'm able to cope
> with it.  The algebra is just beyond me.
>
> TIA ... and best wishes to all for physical safety and psychological
> resilience in these times,
>
> Chris
>
> --
> Small contribution in our coronavirus rigours:
>
> https://www.coresystemtrust.org.uk/home/free-options-to-replace-paper-core-forms-during-the-coronavirus-pandemic/
>
> Chris Evans <chris using psyctc.org> Visiting Professor, University of
> Sheffield <chris.evans using sheffield.ac.uk>
> I do some consultation work for the University of Roehampton <
> chris.evans using roehampton.ac.uk> and other places
> but <chris using psyctc.org> remains my main Email address.  I have a work web
> site at:
>    https://www.psyctc.org/psyctc/
> and a site I manage for CORE and CORE system trust at:
>    http://www.coresystemtrust.org.uk/
> I have "semigrated" to France, see:
>    https://www.psyctc.org/pelerinage2016/semigrating-to-france/
>
> https://www.psyctc.org/pelerinage2016/register-to-get-updates-from-pelerinage2016/
>
> If you want an Emeeting, I am trying to keep them to Thursdays and my
> diary is at:
>    https://www.psyctc.org/pelerinage2016/ceworkdiary/
> Beware: French time, generally an hour ahead of UK.
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list