[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models

Juho Kristian Ruohonen juho@kr|@t|@n@ruohonen @end|ng |rom gm@||@com
Tue Mar 1 14:24:42 CET 2022


Dear John (Fox, as well as other list members),

I've now written a simple function to try and calculate GVIFS for all
predictors in a nnet::multinom() object based on John's example code. If
its results are correct (see below), I will proceed to write a version that
also works with mixed-effects multinomial models fit by brms::brm(). Here's
the code:

gvif.multinom <- function(model){
>   (classes <- model$lev)
>   (V.all <- vcov(model))
>   (V.noIntercepts <- V.all[!grepl("\\(Intercept\\)$", rownames(V.all),
> perl = T),
>                            !grepl("\\(Intercept\\)$", colnames(V.all),
> perl = T)])
>   (R <- cov2cor(V.noIntercepts))
>   (terms <- attr(model$terms, "term.labels"))
>   (gvif <- numeric(length = length(terms)))
>   (names(gvif) <- terms)
>   (SE.multiplier <- numeric(length = length(terms)))
>   (names(SE.multiplier) <- terms)
>   #The line below tries to capture all factor levels into a regex for coef
> name matching.
>   (LevelsRegex <- paste0("(", paste(unlist(model$xlevels), collapse =
> "|"),")?"))
>
>   for(i in terms){
>     #The regex stuff below tries to ensure all interaction coefficients
> are matched, including those involving factors.
>     if(grepl(":", i)){
>       (termname <- gsub(":", paste0(LevelsRegex, ":"), i, perl = T))
>     }else{termname <- i}
>     (RegexToMatch <- paste0("^(", paste(classes[2:length(classes)],
> collapse = "|") ,"):", termname, LevelsRegex, "$"))
>
>     #Now the actual calculation:
>     (indices <- grep(RegexToMatch, rownames(R), perl = T))
>     (gvif[i] <- det(R[indices, indices]) * det(R[-indices, -indices]) /
> det(R))
>     (SE.multiplier[i] <- gvif[i]^(1/(2*length(indices))))
>   }
>   #Put the results together and order them by degree of SE inflation:
>   (result <- cbind(GVIF = gvif, `GVIF^(1/(2df))` = SE.multiplier))
>   return(result[order(result[,"GVIF^(1/(2df))"], decreasing = T),])}
>

The results seem correct to me when applied to John's example model fit to
the BEPS data. However, that dataset contains no multi-df factors, of which
my own models have many. Below is a maximally simple example with one
multi-df factor (*region*):

mod1 <- multinom(partic ~., data = carData::Womenlf)
> gvif.multinom(mod1)
>
> GVIF GVIF^(1/(2df))
> children 1.298794       1.067542
> hincome  1.184215       1.043176
> region   1.381480       1.020403
>

These results look plausible to me. Finally, below is an example involving
both a multi-df factor and an interaction:

mod2 <- update(mod1, ~. +children:region)
> gvif.multinom(mod2)
>
>                         GVIF GVIF^(1/(2df))
> children:region 4.965762e+16      11.053482
> region          1.420418e+16      10.221768
> children        1.471412e+03       6.193463
> hincome         6.462161e+00       1.594390
>

These results look a bit more dubious. To be sure, it is to be expected
that interaction terms will introduce a lot of collinearity. But an 11-fold
increase in SE? I hope someone can tell me whether this is correct or not!

Best,

Juho











ti 1. maalisk. 2022 klo 0.05 John Fox (jfox using mcmaster.ca) kirjoitti:

> Dear Juha,
>
> On 2022-02-28 5:00 p.m., Juho Kristian Ruohonen wrote:
> > Apologies for my misreading, John, and many thanks for showing how the
> > calculation is done for a single term.
> >
> > Do you think *vif()* might be updated in the near future with the
> > capability of auto-detecting a multinomial model and returning
> > mathematically correct GVIF statistics?
>
> The thought crossed my mind, but I'd want to do it in a general way, not
> just for the multinom() function, and in a way that avoids incorrect
> results such as those currently produced for "multinom" models, albeit
> with a warning. I can't guarantee whether or when I'll be able to do that.
>
> John
>
> >
> > If not, I'll proceed to writing my own function based on your example.
> > However, /car/ is such an excellent and widely used package that the
> > greatest benefit to mankind would probably accrue if /car /was upgraded
> > with this feature sooner rather than later.
> >
> > Best,
> >
> > Juho
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > ma 28. helmik. 2022 klo 17.08 John Fox (jfox using mcmaster.ca
> > <mailto:jfox using mcmaster.ca>) kirjoitti:
> >
> >     Dear Juho,
> >
> >     On 2022-02-28 2:06 a.m., Juho Kristian Ruohonen wrote:
> >      > Dear Professor Fox and other list members,
> >      >
> >      > Profuse thanks for doing that detective work for me! I myself
> >     thought
> >      > the inflation factors reported by check_collinearity() were
> >     suspiciously
> >      > high, but unlike you I lacked the expertise to identify what was
> >     going on.
> >      >
> >      > As for your suggested approach, have I understood this correctly:
> >      >
> >      > Since there doesn't yet exist an R function that will calculate
> the
> >      > (G)VIFS of multinomial models correctly, my best bet for now is
> >     just to
> >      > ignore the fact that such models partition the data into C-1
> >     subsets,
> >      > and to calculate approximate GVIFs from the entire dataset at
> >     once as if
> >      > the response were continuous? And a simple way to do this is to
> >      > construct a fake continuous response, call *lm(fakeresponse ~.)*,
> >     and
> >      > apply *car::vif()* on the result?
> >
> >     No, you misunderstand my suggestion, which perhaps isn't surprising
> >     given the length of my message. What you propose is what I suggested
> as
> >     a rough approximation *before* I confirmed that my guess of the
> >     solution
> >     was correct.
> >
> >     The R code that I sent yesterday showed how to compute the GVIF for a
> >     multinomial regression model, and I suggested that you write either a
> >     script or a simple function to do that. Here's a function that will
> >     work
> >     for a model object that responds to vcov():
> >
> >     GVIF <- function(model, intercepts, term){
> >         # model: regression model object
> >         # intercepts: row/column positions of intercepts in the
> coefficient
> >     covariance matrix
> >         # term: row/column positions of the coefficients for the focal
> term
> >         V <- vcov(model)
> >         term <- colnames(V)[term]
> >         V <- V[-intercepts, -intercepts]
> >         V <- cov2cor(V)
> >         term <- which(colnames(V) %in% term)
> >         gvif <- det(V[term, term])*det(V[-term, -term])/det(V)
> >         c(GVIF=gvif, "GVIF^(1/(2*p))"=gvif^(1/(2*length(term))))
> >     }
> >
> >     and here's an application to the multinom() example that I showed you
> >     yesterday:
> >
> >       > colnames(vcov(m)) # to get coefficient positions
> >        [1] "Labour:(Intercept)"                       "Labour:age"
> >
> >        [3] "Labour:economic.cond.national"
> >     "Labour:economic.cond.household"
> >        [5] "Labour:Blair"                             "Labour:Hague"
> >
> >        [7] "Labour:Kennedy"                           "Labour:Europe"
> >
> >        [9] "Labour:political.knowledge"               "Labour:gendermale"
> >
> >     [11] "Liberal Democrat:(Intercept)"             "Liberal
> Democrat:age"
> >
> >     [13] "Liberal Democrat:economic.cond.national"  "Liberal
> >     Democrat:economic.cond.household"
> >     [15] "Liberal Democrat:Blair"                   "Liberal
> >     Democrat:Hague"
> >
> >     [17] "Liberal Democrat:Kennedy"                 "Liberal
> >     Democrat:Europe"
> >     [19] "Liberal Democrat:political.knowledge"     "Liberal
> >     Democrat:gendermale"
> >
> >       > GVIF(m, intercepts=c(1, 11), term=c(2, 12)) # GVIF for age
> >                 GVIF GVIF^(1/(2*p))
> >             1.046232       1.011363
> >
> >
> >     Finally, here's what you get for a linear model with the same RHS
> >     (where
> >     the sqrt(VIF) should be a rough approximation to GVIF^(1/4) reported
> by
> >     my GVIF() function):
> >
> >       > m.lm <- lm(as.numeric(vote) ~ . - vote1, data=BEPS)
> >       > sqrt(car::vif(m.lm))
> >                           age  economic.cond.national
> >     economic.cond.household
> >                         Blair
> >                      1.006508                1.124132
> >     1.075656
> >                      1.118441
> >                         Hague                 Kennedy
> >     Europe
> >           political.knowledge
> >                      1.066799                1.015532
> >     1.101741
> >                      1.028546
> >                        gender
> >                      1.017386
> >
> >
> >     John
> >
> >      >
> >      > Best,
> >      >
> >      > Juho
> >      >
> >      > ma 28. helmik. 2022 klo 2.23 John Fox (jfox using mcmaster.ca
> >     <mailto:jfox using mcmaster.ca>
> >      > <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>) kirjoitti:
> >      >
> >      >     Dear Juho,
> >      >
> >      >     I've now had a chance to think about this problem some more,
> >     and I
> >      >     believe that the approach I suggested is correct. I also had
> an
> >      >     opportunity to talk the problem over a bit with Georges
> >     Monette, who
> >      >     coauthored the paper that introduced generalized variance
> >     inflation
> >      >     factors (GVIFs). On the other hand, the results produced by
> >      >     performance::check_collinearity() for multinomial logit
> >     models don't
> >      >     seem to be correct (see below).
> >      >
> >      >     Here's an example, using the nnet::multinom() function to fit
> a
> >      >     multinomial logit model, with alternative parametrizations of
> the
> >      >     LHS of
> >      >     the model:
> >      >
> >      >     --------- snip -----------
> >      >
> >      >       > library(nnet) # for multinom()
> >      >       > library(carData) # for BEPS data set
> >      >
> >      >       > # alternative ordering of the response levels:
> >      >       > BEPS$vote1 <- factor(BEPS$vote, levels=c("Labour",
> "Liberal
> >      >     Democrat", "Conservative"))
> >      >       > levels(BEPS$vote)
> >      >     [1] "Conservative"     "Labour"           "Liberal Democrat"
> >      >       > levels(BEPS$vote1)
> >      >     [1] "Labour"           "Liberal Democrat" "Conservative"
> >      >
> >      >       > m <- multinom(vote ~ . - vote1, data=BEPS)
> >      >     # weights:  33 (20 variable)
> >      >     initial  value 1675.383740
> >      >     iter  10 value 1345.935273
> >      >     iter  20 value 1150.956807
> >      >     iter  30 value 1141.921662
> >      >     iter  30 value 1141.921661
> >      >     iter  30 value 1141.921661
> >      >     final  value 1141.921661
> >      >     converged
> >      >       > m1 <- multinom(vote1 ~ . - vote, data=BEPS)
> >      >     # weights:  33 (20 variable)
> >      >     initial  value 1675.383740
> >      >     iter  10 value 1280.439304
> >      >     iter  20 value 1165.513772
> >      >     final  value 1141.921662
> >      >     converged
> >      >
> >      >       > rbind(coef(m), coef(m1)) # compare coefficients
> >      >                        (Intercept)          age
> >     economic.cond.national
> >      >     economic.cond.household
> >      >     Labour             0.9515214 -0.021913989
> 0.5575707
> >      >            0.15839096
> >      >     Liberal Democrat   1.4119306 -0.016810735
> 0.1810761
> >      >           -0.01196664
> >      >     Liberal Democrat   0.4604567  0.005102666
>  -0.3764928
> >      >           -0.17036682
> >      >     Conservative      -0.9514466  0.021912305
>  -0.5575644
> >      >           -0.15838744
> >      >                             Blair       Hague    Kennedy
> Europe
> >      >     political.knowledge
> >      >     Labour            0.8371764 -0.90775585  0.2513436 -0.22781308
> >      >     -0.5370612
> >      >     Liberal Democrat  0.2937331 -0.82217625  0.6710567 -0.20004624
> >      >     -0.2034605
> >      >     Liberal Democrat -0.5434408  0.08559455  0.4197027  0.02776465
> >      >     0.3336068
> >      >     Conservative     -0.8371670  0.90778068 -0.2513735  0.22781092
> >      >     0.5370545
> >      >                         gendermale
> >      >     Labour            0.13765774
> >      >     Liberal Democrat  0.12640823
> >      >     Liberal Democrat -0.01125898
> >      >     Conservative     -0.13764849
> >      >
> >      >       > c(logLik(m), logLik(m1)) # same fit to the data
> >      >     [1] -1141.922 -1141.922
> >      >
> >      >       > # covariance matrices for coefficients:
> >      >       > V <- vcov(m)
> >      >       > V1 <- vcov(m1)
> >      >       > cbind(colnames(V), colnames(V1)) # compare
> >      >             [,1]                                       [,2]
> >      >
> >      >        [1,] "Labour:(Intercept)"                       "Liberal
> >      >     Democrat:(Intercept)"
> >      >        [2,] "Labour:age"                               "Liberal
> >      >     Democrat:age"
> >      >
> >      >        [3,] "Labour:economic.cond.national"            "Liberal
> >      >     Democrat:economic.cond.national"
> >      >        [4,] "Labour:economic.cond.household"           "Liberal
> >      >     Democrat:economic.cond.household"
> >      >        [5,] "Labour:Blair"                             "Liberal
> >      >     Democrat:Blair"
> >      >        [6,] "Labour:Hague"                             "Liberal
> >      >     Democrat:Hague"
> >      >        [7,] "Labour:Kennedy"                           "Liberal
> >      >     Democrat:Kennedy"
> >      >        [8,] "Labour:Europe"                            "Liberal
> >      >     Democrat:Europe"
> >      >        [9,] "Labour:political.knowledge"               "Liberal
> >      >     Democrat:political.knowledge"
> >      >     [10,] "Labour:gendermale"                        "Liberal
> >      >     Democrat:gendermale"
> >      >     [11,] "Liberal Democrat:(Intercept)"
> >      >     "Conservative:(Intercept)"
> >      >     [12,] "Liberal Democrat:age"
> >       "Conservative:age"
> >      >
> >      >     [13,] "Liberal Democrat:economic.cond.national"
> >      >     "Conservative:economic.cond.national"
> >      >     [14,] "Liberal Democrat:economic.cond.household"
> >      >     "Conservative:economic.cond.household"
> >      >     [15,] "Liberal Democrat:Blair"
> >       "Conservative:Blair"
> >      >
> >      >     [16,] "Liberal Democrat:Hague"
> >       "Conservative:Hague"
> >      >
> >      >     [17,] "Liberal Democrat:Kennedy"
> >       "Conservative:Kennedy"
> >      >
> >      >     [18,] "Liberal Democrat:Europe"
> >     "Conservative:Europe"
> >      >
> >      >     [19,] "Liberal Democrat:political.knowledge"
> >      >     "Conservative:political.knowledge"
> >      >     [20,] "Liberal Democrat:gendermale"
> >      >     "Conservative:gendermale"
> >      >
> >      >       > int <- c(1, 11) # remove intercepts
> >      >       > colnames(V)[int]
> >      >     [1] "Labour:(Intercept)"           "Liberal
> Democrat:(Intercept)"
> >      >
> >      >       > colnames(V1)[int]
> >      >     [1] "Liberal Democrat:(Intercept)" "Conservative:(Intercept)"
> >      >       > V <- V[-int, -int]
> >      >       > V1 <- V1[-int, -int]
> >      >
> >      >       > age <- c(1, 10) # locate age coefficients
> >      >       > colnames(V)[age]
> >      >     [1] "Labour:age"           "Liberal Democrat:age"
> >      >       > colnames(V1)[age]
> >      >     [1] "Liberal Democrat:age" "Conservative:age"
> >      >
> >      >       > V <- cov2cor(V) # compute coefficient correlations
> >      >       > V1 <- cov2cor(V1)
> >      >
> >      >       > # compare GVIFs:
> >      >       > c(det(V[age, age])*det(V[-age, -age])/det(V),
> >      >     +   det(V1[age, age])*det(V1[-age, -age])/det(V1))
> >      >     [1] 1.046232 1.046229
> >      >
> >      >     --------- snip -----------
> >      >
> >      >     For curiosity, I applied car::vif() and
> >      >     performance::check_collinearity() to these models to see what
> >     they
> >      >     would
> >      >     do. Both returned the wrong answer. vif() produced a warning,
> but
> >      >     check_collinearity() didn't:
> >      >
> >      >     --------- snip -----------
> >      >
> >      >       > car::vif(m1)
> >      >                           age  economic.cond.national
> >      >     economic.cond.household
> >      >                     15.461045               22.137772
> >      >       16.693877
> >      >                         Blair                   Hague
> >      >       Kennedy
> >      >                     14.681562                7.483039
> >      >       15.812067
> >      >                        Europe     political.knowledge
> >      >     gender
> >      >                      6.502119                4.219507
> >      >     2.313885
> >      >     Warning message:
> >      >     In vif.default(m1) : No intercept: vifs may not be sensible.
> >      >
> >      >       > performance::check_collinearity(m)
> >      >     # Check for Multicollinearity
> >      >
> >      >     Low Correlation
> >      >
> >      >                           Term  VIF Increased SE Tolerance
> >      >                            age 1.72         1.31      0.58
> >      >         economic.cond.national 1.85         1.36      0.54
> >      >        economic.cond.household 1.86         1.37      0.54
> >      >                          Blair 1.63         1.28      0.61
> >      >                          Hague 1.94         1.39      0.52
> >      >                        Kennedy 1.70         1.30      0.59
> >      >                         Europe 2.01         1.42      0.50
> >      >            political.knowledge 1.94         1.39      0.52
> >      >                         gender 1.78         1.33      0.56
> >      >       > performance::check_collinearity(m1)
> >      >     # Check for Multicollinearity
> >      >
> >      >     Low Correlation
> >      >
> >      >                           Term  VIF Increased SE Tolerance
> >      >                            age 1.19         1.09      0.84
> >      >         economic.cond.national 1.42         1.19      0.70
> >      >        economic.cond.household 1.32         1.15      0.76
> >      >                          Blair 1.50         1.22      0.67
> >      >                          Hague 1.30         1.14      0.77
> >      >                        Kennedy 1.19         1.09      0.84
> >      >                         Europe 1.34         1.16      0.75
> >      >            political.knowledge 1.30         1.14      0.77
> >      >                         gender 1.23         1.11      0.81
> >      >
> >      >     --------- snip -----------
> >      >
> >      >     I looked at the code for vif() and check_collinearity() to
> >     see where
> >      >     they went wrong. Both failed to handle the two intercepts in
> >     the model
> >      >     correctly -- vif() thought there was no intercept and
> >      >     check_collinearity() just removed the first intercept but not
> the
> >      >     second.
> >      >
> >      >     In examining the code for check_collinearity(), I discovered a
> >      >     couple of
> >      >     additional disconcerting facts. First, part of the code seems
> >     to be
> >      >     copied from vif.default(). Second, as a consequence,
> >      >     check_collinearity() actually computes GVIFs rather than VIFs
> >     (and
> >      >     doesn't reference either the Fox and Monette paper
> >     introducing GVIFs or
> >      >     the car package) but doesn't seem to understand that, and,
> >     for example,
> >      >     takes the squareroot of the GVIF (reported in the column
> marked
> >      >     "Increased SE") rather than the 2p root (when there are p > 1
> >      >     coefficients in a term).
> >      >
> >      >     Here's the relevant code from the two functions (where . . .
> >     denotes
> >      >     elided lines) -- the default method for vif() and
> >      >     .check_collinearity(),
> >      >     which is called by check_collinearity.default():
> >      >
> >      >     --------- snip -----------
> >      >
> >      >       > car:::vif.default
> >      >     function (mod, ...)
> >      >     {
> >      >           . . .
> >      >           v <- vcov(mod)
> >      >           assign <- attr(model.matrix(mod), "assign")
> >      >           if (names(coefficients(mod)[1]) == "(Intercept)") {
> >      >               v <- v[-1, -1]
> >      >               assign <- assign[-1]
> >      >           }
> >      >           else warning("No intercept: vifs may not be sensible.")
> >      >           terms <- labels(terms(mod))
> >      >           n.terms <- length(terms)
> >      >           if (n.terms < 2)
> >      >               stop("model contains fewer than 2 terms")
> >      >           R <- cov2cor(v)
> >      >           detR <- det(R)
> >      >           . . .
> >      >           for (term in 1:n.terms) {
> >      >               subs <- which(assign == term)
> >      >               result[term, 1] <- det(as.matrix(R[subs, subs])) *
> >      >     det(as.matrix(R[-subs,
> >      >                   -subs]))/detR
> >      >               result[term, 2] <- length(subs)
> >      >           }
> >      >           . . .
> >      >     }
> >      >
> >      >       > performance:::.check_collinearity
> >      >     function (x, component, verbose = TRUE)
> >      >     {
> >      >           v <- insight::get_varcov(x, component = component,
> >     verbose =
> >      >     FALSE)
> >      >           assign <- .term_assignments(x, component, verbose =
> >     verbose)
> >      >           . . .
> >      >           if (insight::has_intercept(x)) {
> >      >               v <- v[-1, -1]
> >      >               assign <- assign[-1]
> >      >           }
> >      >           else {
> >      >               if (isTRUE(verbose)) {
> >      >                   warning("Model has no intercept. VIFs may not be
> >      >     sensible.",
> >      >                       call. = FALSE)
> >      >               }
> >      >           }
> >      >               . . .
> >      >               terms <- labels(stats::terms(f[[component]]))
> >      >               . . .
> >      >           n.terms <- length(terms)
> >      >           if (n.terms < 2) {
> >      >               if (isTRUE(verbose)) {
> >      >                   warning(insight::format_message(sprintf("Not
> >     enough model
> >      >     terms in the %s part of the model to check for
> >     multicollinearity.",
> >      >                       component)), call. = FALSE)
> >      >               }
> >      >               return(NULL)
> >      >           }
> >      >           R <- stats::cov2cor(v)
> >      >           detR <- det(R)
> >      >           . . .
> >      >           for (term in 1:n.terms) {
> >      >               subs <- which(assign == term)
> >      >                   . . .
> >      >                   result <- c(result, det(as.matrix(R[subs,
> subs])) *
> >      >                       det(as.matrix(R[-subs, -subs]))/detR)
> >      >                   . . .
> >      >           }
> >      >           . . .
> >      >     }
> >      >
> >      >     --------- snip -----------
> >      >
> >      >     So, the upshot of all this is that you should be able to do
> >     what you
> >      >     want, but not with either car::vif() or
> >      >     performance::check_collinearity(). Instead, either write your
> own
> >      >     function or do the computations in a script.
> >      >
> >      >     There's also a lesson here about S3 default methods: The fact
> >     that a
> >      >     default method returns a result rather than throwing an error
> >     or a
> >      >     warning doesn't mean that the result is the right answer.
> >      >
> >      >     I hope this helps,
> >      >        John
> >      >
> >      >
> >      >     On 2022-02-26 3:45 p.m., Juho Kristian Ruohonen wrote:
> >      >      > Dear John W,
> >      >      >
> >      >      > Thank you very much for the tip-off! Apologies for not
> >     responding
> >      >     earlier
> >      >      > (gmail apparently decided to direct your email right into
> the
> >      >     junk folder).
> >      >      > I am very pleased to note that the package you mention does
> >      >     indeed work
> >      >      > with *brms* multinomial models! Thanks again!
> >      >      >
> >      >      > Best,
> >      >      >
> >      >      > Juho
> >      >      >
> >      >      > pe 25. helmik. 2022 klo 19.23 John Willoughby
> >      >     (johnwillec using gmail.com <mailto:johnwillec using gmail.com>
> >     <mailto:johnwillec using gmail.com <mailto:johnwillec using gmail.com>>)
> >      >      > kirjoitti:
> >      >      >
> >      >      >> Have you tried the check_collinearity() function in the
> >     performance
> >      >      >> package? It's supposed to work on brms models, but
> whether it
> >      >     will work on
> >      >      >> a multinomial model I don't know.  It works well on mixed
> >     models
> >      >     generated
> >      >      >> by glmmTMB().
> >      >      >>
> >      >      >> John Willoughby
> >      >      >>
> >      >      >>
> >      >      >> On Fri, Feb 25, 2022 at 3:01 AM
> >      >     <r-sig-mixed-models-request using r-project.org
> >     <mailto:r-sig-mixed-models-request using r-project.org>
> >      >     <mailto:r-sig-mixed-models-request using r-project.org
> >     <mailto:r-sig-mixed-models-request using r-project.org>>>
> >      >      >> wrote:
> >      >      >>
> >      >      >>> Send R-sig-mixed-models mailing list submissions to
> >      >      >>> r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>
> >      >     <mailto:r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>>
> >      >      >>>
> >      >      >>> To subscribe or unsubscribe via the World Wide Web, visit
> >      >      >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
> >      >      >>> or, via email, send a message with subject or body
> 'help' to
> >      >      >>> r-sig-mixed-models-request using r-project.org
> >     <mailto:r-sig-mixed-models-request using r-project.org>
> >      >     <mailto:r-sig-mixed-models-request using r-project.org
> >     <mailto:r-sig-mixed-models-request using r-project.org>>
> >      >      >>>
> >      >      >>> You can reach the person managing the list at
> >      >      >>> r-sig-mixed-models-owner using r-project.org
> >     <mailto:r-sig-mixed-models-owner using r-project.org>
> >      >     <mailto:r-sig-mixed-models-owner using r-project.org
> >     <mailto:r-sig-mixed-models-owner using r-project.org>>
> >      >      >>>
> >      >      >>> When replying, please edit your Subject line so it is
> >     more specific
> >      >      >>> than "Re: Contents of R-sig-mixed-models digest..."
> >      >      >>>
> >      >      >>>
> >      >      >>> Today's Topics:
> >      >      >>>
> >      >      >>>     1. Collinearity diagnostics for (mixed) multinomial
> >     models
> >      >      >>>        (Juho Kristian Ruohonen)
> >      >      >>>
> >      >      >>>
> >      >
> >
>  ----------------------------------------------------------------------
> >      >      >>>
> >      >      >>> Message: 1
> >      >      >>> Date: Fri, 25 Feb 2022 10:23:25 +0200
> >      >      >>> From: Juho Kristian Ruohonen
> >     <juho.kristian.ruohonen using gmail.com
> >     <mailto:juho.kristian.ruohonen using gmail.com>
> >      >     <mailto:juho.kristian.ruohonen using gmail.com
> >     <mailto:juho.kristian.ruohonen using gmail.com>>>
> >      >      >>> To: John Fox <jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>
> >     <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>>
> >      >      >>> Cc: "r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>
> >      >     <mailto:r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>>"
> >      >      >>>          <r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>
> >      >     <mailto:r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>>>
> >      >      >>> Subject: [R-sig-ME] Collinearity diagnostics for (mixed)
> >      >     multinomial
> >      >      >>>          models
> >      >      >>> Message-ID:
> >      >      >>>          <
> >      >      >>>
> >      >
> >     CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> >     <mailto:
> CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>
> >      >
> >       <mailto:
> CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> >>>
> >      >      >>> Content-Type: text/plain; charset="utf-8"
> >      >      >>>
> >      >      >>> Dear John (and anyone else qualified to comment),
> >      >      >>>
> >      >      >>> I fit lots of mixed-effects multinomial models in my
> >     research,
> >      >     and I
> >      >      >> would
> >      >      >>> like to see some (multi)collinearity diagnostics on the
> >     fixed
> >      >     effects, of
> >      >      >>> which there are over 30. My models are fit using the
> >     Bayesian
> >      >     *brms*
> >      >      >>> package because I know of no frequentist packages with
> >      >     multinomial GLMM
> >      >      >>> compatibility.
> >      >      >>>
> >      >      >>> With continuous or dichotomous outcomes, my go-to
> >     function for
> >      >      >> calculating
> >      >      >>> multicollinearity diagnostics is of course *vif()* from
> >     the *car*
> >      >      >> package.
> >      >      >>> As expected, however, this function does not report
> sensible
> >      >     diagnostics
> >      >      >>> for multinomial models -- not even for standard ones fit
> >     by the
> >      >     *nnet*
> >      >      >>> package's *multinom()* function. The reason, I presume,
> is
> >      >     because a
> >      >      >>> multinomial model is not really one but C-1 regression
> >     models
> >      >     (where C
> >      >      >> is
> >      >      >>> the number of response categories) and the *vif()*
> >     function is not
> >      >      >> designed
> >      >      >>> to deal with this scenario.
> >      >      >>>
> >      >      >>> Therefore, in order to obtain meaningful collinearity
> >     metrics,
> >      >     my present
> >      >      >>> plan is to write a simple helper function that uses
> >     *vif() *to
> >      >     calculate
> >      >      >>> and present (generalized) variance inflation metrics for
> >     the C-1
> >      >      >>> sub-datasets to which the C-1 component binomial models
> >     of the
> >      >     overall
> >      >      >>> multinomial model are fit. In other words, it will
> partition
> >      >     the data
> >      >      >> into
> >      >      >>> those C-1 subsets, and then apply *vif()* to as many
> linear
> >      >     regressions
> >      >      >>> using a made-up continuous response and the fixed
> effects of
> >      >     interest.
> >      >      >>>
> >      >      >>> Does this seem like a sensible approach?
> >      >      >>>
> >      >      >>> Best,
> >      >      >>>
> >      >      >>> Juho
> >      >      >>>
> >      >      >>>
> >      >      >>>
> >      >      >>
> >      >      >>          [[alternative HTML version deleted]]
> >      >      >>
> >      >      >> _______________________________________________
> >      >      >> R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>
> >      >     <mailto:R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>> mailing list
> >      >      >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
> >      >      >>
> >      >      >
> >      >      >       [[alternative HTML version deleted]]
> >      >      >
> >      >      > _______________________________________________
> >      >      > R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>
> >      >     <mailto:R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org>> mailing list
> >      >      > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
> >      >     --
> >      >     John Fox, Professor Emeritus
> >      >     McMaster University
> >      >     Hamilton, Ontario, Canada
> >      >     web: https://socialsciences.mcmaster.ca/jfox/
> >     <https://socialsciences.mcmaster.ca/jfox/>
> >      >     <https://socialsciences.mcmaster.ca/jfox/
> >     <https://socialsciences.mcmaster.ca/jfox/>>
> >      >
> >     --
> >     John Fox, Professor Emeritus
> >     McMaster University
> >     Hamilton, Ontario, Canada
> >     web: https://socialsciences.mcmaster.ca/jfox/
> >     <https://socialsciences.mcmaster.ca/jfox/>
> >
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list