[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models

John Fox j|ox @end|ng |rom mcm@@ter@c@
Tue Mar 1 17:01:34 CET 2022


Dear Juho,

On 2022-03-01 8:24 a.m., Juho Kristian Ruohonen wrote:
> Dear John (Fox, as well as other list members),
> 
> I've now written a simple function to try and calculate GVIFS for all 
> predictors in a nnet::multinom() object based on John's example code. If 
> its results are correct (see below), I will proceed to write a version 
> that also works with mixed-effects multinomial models fit by 
> brms::brm(). Here's the code:
> 
>     gvif.multinom <- function(model){
>        (classes <- model$lev)
>        (V.all <- vcov(model))
>        (V.noIntercepts <- V.all[!grepl("\\(Intercept\\)$",
>     rownames(V.all), perl = T),
>                                 !grepl("\\(Intercept\\)$",
>     colnames(V.all), perl = T)])
>        (R <- cov2cor(V.noIntercepts))
>        (terms <- attr(model$terms, "term.labels"))
>        (gvif <- numeric(length = length(terms)))
>        (names(gvif) <- terms)
>        (SE.multiplier <- numeric(length = length(terms)))
>        (names(SE.multiplier) <- terms)
>        #The line below tries to capture all factor levels into a regex
>     for coef name matching.
>        (LevelsRegex <- paste0("(", paste(unlist(model$xlevels), collapse
>     = "|"),")?"))
> 
>        for(i in terms){
>          #The regex stuff below tries to ensure all interaction
>     coefficients are matched, including those involving factors.
>          if(grepl(":", i)){
>            (termname <- gsub(":", paste0(LevelsRegex, ":"), i, perl = T))
>          }else{termname <- i}
>          (RegexToMatch <- paste0("^(", paste(classes[2:length(classes)],
>     collapse = "|") ,"):", termname, LevelsRegex, "$"))
> 
>          #Now the actual calculation:
>          (indices <- grep(RegexToMatch, rownames(R), perl = T))
>          (gvif[i] <- det(R[indices, indices]) * det(R[-indices,
>     -indices]) / det(R))
>          (SE.multiplier[i] <- gvif[i]^(1/(2*length(indices))))
>        }
>        #Put the results together and order them by degree of SE inflation:
>        (result <- cbind(GVIF = gvif, `GVIF^(1/(2df))` = SE.multiplier))
>        return(result[order(result[,"GVIF^(1/(2df))"], decreasing = T),])}
> 
> 
> The results seem correct to me when applied to John's example model fit 
> to the BEPS data. However, that dataset contains no multi-df factors, of 
> which my own models have many. Below is a maximally simple example with 
> one multi-df factor (/region/):
> 
>     mod1 <- multinom(partic ~., data = carData::Womenlf)
>     gvif.multinom(mod1)
> 
>     GVIF GVIF^(1/(2df))
>     children 1.298794       1.067542
>     hincome  1.184215       1.043176
>     region   1.381480       1.020403
> 
> 
> These results look plausible to me. Finally, below is an example 
> involving both a multi-df factor and an interaction:
> 
>     mod2 <- update(mod1, ~. +children:region)
>     gvif.multinom(mod2)
> 
>                              GVIF GVIF^(1/(2df))
>     children:region 4.965762e+16      11.053482
>     region          1.420418e+16      10.221768
>     children        1.471412e+03       6.193463
>     hincome         6.462161e+00       1.594390
> 
> 
> These results look a bit more dubious. To be sure, it is to be expected 
> that interaction terms will introduce a lot of collinearity. But an 
> 11-fold increase in SE? I hope someone can tell me whether this is 
> correct or not!

You don't need someone else to check your work because you could just 
apply the simple function that I sent you yesterday, which, though not 
automatic, computes the GVIFs in a transparent manner.

A brief comment on GVIFs for models with interactions (this isn't the 
place to discuss the question in detail): The Fox and Monette JASA paper 
addresses the question briefly in the context of a two-way ANOVA, but I 
don't think that the approach suggested there is easily generalized.

The following simple approach pays attention to what's invariant under 
different parametrizations of the RHS side of the model: Simultaneously 
check the collinearity of all of the coefficients of an interaction 
together with the main effects and, potentially, lower-order 
interactions that are marginal to it. So, e.g., in the model y ~ a + b + 
a:b + c, you'd check all of the coefficients for a, b, and a:b together.

Alternatively, one could focus in turn on each explanatory variable and 
check the collinearity of all coefficients to which it is marginal. So 
in y ~ a + b + c + a:b + a:c + d, when you focus on a, you'd look at all 
of the coefficients for a, b, c, a:b, and a:c.

John

> 
> Best,
> 
> Juho
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ti 1. maalisk. 2022 klo 0.05 John Fox (jfox using mcmaster.ca 
> <mailto:jfox using mcmaster.ca>) kirjoitti:
> 
>     Dear Juha,
> 
>     On 2022-02-28 5:00 p.m., Juho Kristian Ruohonen wrote:
>      > Apologies for my misreading, John, and many thanks for showing
>     how the
>      > calculation is done for a single term.
>      >
>      > Do you think *vif()* might be updated in the near future with the
>      > capability of auto-detecting a multinomial model and returning
>      > mathematically correct GVIF statistics?
> 
>     The thought crossed my mind, but I'd want to do it in a general way,
>     not
>     just for the multinom() function, and in a way that avoids incorrect
>     results such as those currently produced for "multinom" models, albeit
>     with a warning. I can't guarantee whether or when I'll be able to do
>     that.
> 
>     John
> 
>      >
>      > If not, I'll proceed to writing my own function based on your
>     example.
>      > However, /car/ is such an excellent and widely used package that the
>      > greatest benefit to mankind would probably accrue if /car /was
>     upgraded
>      > with this feature sooner rather than later.
>      >
>      > Best,
>      >
>      > Juho
>      >
>      >
>      >
>      >
>      >
>      >
>      >
>      >
>      >
>      > ma 28. helmik. 2022 klo 17.08 John Fox (jfox using mcmaster.ca
>     <mailto:jfox using mcmaster.ca>
>      > <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>) kirjoitti:
>      >
>      >     Dear Juho,
>      >
>      >     On 2022-02-28 2:06 a.m., Juho Kristian Ruohonen wrote:
>      >      > Dear Professor Fox and other list members,
>      >      >
>      >      > Profuse thanks for doing that detective work for me! I myself
>      >     thought
>      >      > the inflation factors reported by check_collinearity() were
>      >     suspiciously
>      >      > high, but unlike you I lacked the expertise to identify
>     what was
>      >     going on.
>      >      >
>      >      > As for your suggested approach, have I understood this
>     correctly:
>      >      >
>      >      > Since there doesn't yet exist an R function that will
>     calculate the
>      >      > (G)VIFS of multinomial models correctly, my best bet for
>     now is
>      >     just to
>      >      > ignore the fact that such models partition the data into C-1
>      >     subsets,
>      >      > and to calculate approximate GVIFs from the entire dataset at
>      >     once as if
>      >      > the response were continuous? And a simple way to do this
>     is to
>      >      > construct a fake continuous response, call
>     *lm(fakeresponse ~.)*,
>      >     and
>      >      > apply *car::vif()* on the result?
>      >
>      >     No, you misunderstand my suggestion, which perhaps isn't
>     surprising
>      >     given the length of my message. What you propose is what I
>     suggested as
>      >     a rough approximation *before* I confirmed that my guess of the
>      >     solution
>      >     was correct.
>      >
>      >     The R code that I sent yesterday showed how to compute the
>     GVIF for a
>      >     multinomial regression model, and I suggested that you write
>     either a
>      >     script or a simple function to do that. Here's a function
>     that will
>      >     work
>      >     for a model object that responds to vcov():
>      >
>      >     GVIF <- function(model, intercepts, term){
>      >         # model: regression model object
>      >         # intercepts: row/column positions of intercepts in the
>     coefficient
>      >     covariance matrix
>      >         # term: row/column positions of the coefficients for the
>     focal term
>      >         V <- vcov(model)
>      >         term <- colnames(V)[term]
>      >         V <- V[-intercepts, -intercepts]
>      >         V <- cov2cor(V)
>      >         term <- which(colnames(V) %in% term)
>      >         gvif <- det(V[term, term])*det(V[-term, -term])/det(V)
>      >         c(GVIF=gvif, "GVIF^(1/(2*p))"=gvif^(1/(2*length(term))))
>      >     }
>      >
>      >     and here's an application to the multinom() example that I
>     showed you
>      >     yesterday:
>      >
>      >       > colnames(vcov(m)) # to get coefficient positions
>      >        [1] "Labour:(Intercept)"                       "Labour:age"
>      >
>      >        [3] "Labour:economic.cond.national"
>      >     "Labour:economic.cond.household"
>      >        [5] "Labour:Blair"                             "Labour:Hague"
>      >
>      >        [7] "Labour:Kennedy"                           "Labour:Europe"
>      >
>      >        [9] "Labour:political.knowledge"             
>       "Labour:gendermale"
>      >
>      >     [11] "Liberal Democrat:(Intercept)"             "Liberal
>     Democrat:age"
>      >
>      >     [13] "Liberal Democrat:economic.cond.national"  "Liberal
>      >     Democrat:economic.cond.household"
>      >     [15] "Liberal Democrat:Blair"                   "Liberal
>      >     Democrat:Hague"
>      >
>      >     [17] "Liberal Democrat:Kennedy"                 "Liberal
>      >     Democrat:Europe"
>      >     [19] "Liberal Democrat:political.knowledge"     "Liberal
>      >     Democrat:gendermale"
>      >
>      >       > GVIF(m, intercepts=c(1, 11), term=c(2, 12)) # GVIF for age
>      >                 GVIF GVIF^(1/(2*p))
>      >             1.046232       1.011363
>      >
>      >
>      >     Finally, here's what you get for a linear model with the same RHS
>      >     (where
>      >     the sqrt(VIF) should be a rough approximation to GVIF^(1/4)
>     reported by
>      >     my GVIF() function):
>      >
>      >       > m.lm <- lm(as.numeric(vote) ~ . - vote1, data=BEPS)
>      >       > sqrt(car::vif(m.lm))
>      >                           age  economic.cond.national
>      >     economic.cond.household
>      >                         Blair
>      >                      1.006508                1.124132
>      >     1.075656
>      >                      1.118441
>      >                         Hague                 Kennedy
>      >     Europe
>      >           political.knowledge
>      >                      1.066799                1.015532
>      >     1.101741
>      >                      1.028546
>      >                        gender
>      >                      1.017386
>      >
>      >
>      >     John
>      >
>      >      >
>      >      > Best,
>      >      >
>      >      > Juho
>      >      >
>      >      > ma 28. helmik. 2022 klo 2.23 John Fox (jfox using mcmaster.ca
>     <mailto:jfox using mcmaster.ca>
>      >     <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>
>      >      > <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>
>     <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>>) kirjoitti:
>      >      >
>      >      >     Dear Juho,
>      >      >
>      >      >     I've now had a chance to think about this problem some
>     more,
>      >     and I
>      >      >     believe that the approach I suggested is correct. I
>     also had an
>      >      >     opportunity to talk the problem over a bit with Georges
>      >     Monette, who
>      >      >     coauthored the paper that introduced generalized variance
>      >     inflation
>      >      >     factors (GVIFs). On the other hand, the results
>     produced by
>      >      >     performance::check_collinearity() for multinomial logit
>      >     models don't
>      >      >     seem to be correct (see below).
>      >      >
>      >      >     Here's an example, using the nnet::multinom() function
>     to fit a
>      >      >     multinomial logit model, with alternative
>     parametrizations of the
>      >      >     LHS of
>      >      >     the model:
>      >      >
>      >      >     --------- snip -----------
>      >      >
>      >      >       > library(nnet) # for multinom()
>      >      >       > library(carData) # for BEPS data set
>      >      >
>      >      >       > # alternative ordering of the response levels:
>      >      >       > BEPS$vote1 <- factor(BEPS$vote, levels=c("Labour",
>     "Liberal
>      >      >     Democrat", "Conservative"))
>      >      >       > levels(BEPS$vote)
>      >      >     [1] "Conservative"     "Labour"           "Liberal
>     Democrat"
>      >      >       > levels(BEPS$vote1)
>      >      >     [1] "Labour"           "Liberal Democrat" "Conservative"
>      >      >
>      >      >       > m <- multinom(vote ~ . - vote1, data=BEPS)
>      >      >     # weights:  33 (20 variable)
>      >      >     initial  value 1675.383740
>      >      >     iter  10 value 1345.935273
>      >      >     iter  20 value 1150.956807
>      >      >     iter  30 value 1141.921662
>      >      >     iter  30 value 1141.921661
>      >      >     iter  30 value 1141.921661
>      >      >     final  value 1141.921661
>      >      >     converged
>      >      >       > m1 <- multinom(vote1 ~ . - vote, data=BEPS)
>      >      >     # weights:  33 (20 variable)
>      >      >     initial  value 1675.383740
>      >      >     iter  10 value 1280.439304
>      >      >     iter  20 value 1165.513772
>      >      >     final  value 1141.921662
>      >      >     converged
>      >      >
>      >      >       > rbind(coef(m), coef(m1)) # compare coefficients
>      >      >                        (Intercept)          age
>      >     economic.cond.national
>      >      >     economic.cond.household
>      >      >     Labour             0.9515214 -0.021913989             
>     0.5575707
>      >      >            0.15839096
>      >      >     Liberal Democrat   1.4119306 -0.016810735             
>     0.1810761
>      >      >           -0.01196664
>      >      >     Liberal Democrat   0.4604567  0.005102666           
>       -0.3764928
>      >      >           -0.17036682
>      >      >     Conservative      -0.9514466  0.021912305           
>       -0.5575644
>      >      >           -0.15838744
>      >      >                             Blair       Hague    Kennedy 
>          Europe
>      >      >     political.knowledge
>      >      >     Labour            0.8371764 -0.90775585  0.2513436
>     -0.22781308
>      >      >     -0.5370612
>      >      >     Liberal Democrat  0.2937331 -0.82217625  0.6710567
>     -0.20004624
>      >      >     -0.2034605
>      >      >     Liberal Democrat -0.5434408  0.08559455  0.4197027 
>     0.02776465
>      >      >     0.3336068
>      >      >     Conservative     -0.8371670  0.90778068 -0.2513735 
>     0.22781092
>      >      >     0.5370545
>      >      >                         gendermale
>      >      >     Labour            0.13765774
>      >      >     Liberal Democrat  0.12640823
>      >      >     Liberal Democrat -0.01125898
>      >      >     Conservative     -0.13764849
>      >      >
>      >      >       > c(logLik(m), logLik(m1)) # same fit to the data
>      >      >     [1] -1141.922 -1141.922
>      >      >
>      >      >       > # covariance matrices for coefficients:
>      >      >       > V <- vcov(m)
>      >      >       > V1 <- vcov(m1)
>      >      >       > cbind(colnames(V), colnames(V1)) # compare
>      >      >             [,1]                                       [,2]
>      >      >
>      >      >        [1,] "Labour:(Intercept)"                     
>       "Liberal
>      >      >     Democrat:(Intercept)"
>      >      >        [2,] "Labour:age"                             
>       "Liberal
>      >      >     Democrat:age"
>      >      >
>      >      >        [3,] "Labour:economic.cond.national"           
>     "Liberal
>      >      >     Democrat:economic.cond.national"
>      >      >        [4,] "Labour:economic.cond.household"         
>       "Liberal
>      >      >     Democrat:economic.cond.household"
>      >      >        [5,] "Labour:Blair"                           
>       "Liberal
>      >      >     Democrat:Blair"
>      >      >        [6,] "Labour:Hague"                           
>       "Liberal
>      >      >     Democrat:Hague"
>      >      >        [7,] "Labour:Kennedy"                         
>       "Liberal
>      >      >     Democrat:Kennedy"
>      >      >        [8,] "Labour:Europe"                           
>     "Liberal
>      >      >     Democrat:Europe"
>      >      >        [9,] "Labour:political.knowledge"             
>       "Liberal
>      >      >     Democrat:political.knowledge"
>      >      >     [10,] "Labour:gendermale"                        "Liberal
>      >      >     Democrat:gendermale"
>      >      >     [11,] "Liberal Democrat:(Intercept)"
>      >      >     "Conservative:(Intercept)"
>      >      >     [12,] "Liberal Democrat:age"
>      >       "Conservative:age"
>      >      >
>      >      >     [13,] "Liberal Democrat:economic.cond.national"
>      >      >     "Conservative:economic.cond.national"
>      >      >     [14,] "Liberal Democrat:economic.cond.household"
>      >      >     "Conservative:economic.cond.household"
>      >      >     [15,] "Liberal Democrat:Blair"
>      >       "Conservative:Blair"
>      >      >
>      >      >     [16,] "Liberal Democrat:Hague"
>      >       "Conservative:Hague"
>      >      >
>      >      >     [17,] "Liberal Democrat:Kennedy"
>      >       "Conservative:Kennedy"
>      >      >
>      >      >     [18,] "Liberal Democrat:Europe"
>      >     "Conservative:Europe"
>      >      >
>      >      >     [19,] "Liberal Democrat:political.knowledge"
>      >      >     "Conservative:political.knowledge"
>      >      >     [20,] "Liberal Democrat:gendermale"
>      >      >     "Conservative:gendermale"
>      >      >
>      >      >       > int <- c(1, 11) # remove intercepts
>      >      >       > colnames(V)[int]
>      >      >     [1] "Labour:(Intercept)"           "Liberal
>     Democrat:(Intercept)"
>      >      >
>      >      >       > colnames(V1)[int]
>      >      >     [1] "Liberal Democrat:(Intercept)"
>     "Conservative:(Intercept)"
>      >      >       > V <- V[-int, -int]
>      >      >       > V1 <- V1[-int, -int]
>      >      >
>      >      >       > age <- c(1, 10) # locate age coefficients
>      >      >       > colnames(V)[age]
>      >      >     [1] "Labour:age"           "Liberal Democrat:age"
>      >      >       > colnames(V1)[age]
>      >      >     [1] "Liberal Democrat:age" "Conservative:age"
>      >      >
>      >      >       > V <- cov2cor(V) # compute coefficient correlations
>      >      >       > V1 <- cov2cor(V1)
>      >      >
>      >      >       > # compare GVIFs:
>      >      >       > c(det(V[age, age])*det(V[-age, -age])/det(V),
>      >      >     +   det(V1[age, age])*det(V1[-age, -age])/det(V1))
>      >      >     [1] 1.046232 1.046229
>      >      >
>      >      >     --------- snip -----------
>      >      >
>      >      >     For curiosity, I applied car::vif() and
>      >      >     performance::check_collinearity() to these models to
>     see what
>      >     they
>      >      >     would
>      >      >     do. Both returned the wrong answer. vif() produced a
>     warning, but
>      >      >     check_collinearity() didn't:
>      >      >
>      >      >     --------- snip -----------
>      >      >
>      >      >       > car::vif(m1)
>      >      >                           age  economic.cond.national
>      >      >     economic.cond.household
>      >      >                     15.461045               22.137772
>      >      >       16.693877
>      >      >                         Blair                   Hague
>      >      >       Kennedy
>      >      >                     14.681562                7.483039
>      >      >       15.812067
>      >      >                        Europe     political.knowledge
>      >      >     gender
>      >      >                      6.502119                4.219507
>      >      >     2.313885
>      >      >     Warning message:
>      >      >     In vif.default(m1) : No intercept: vifs may not be
>     sensible.
>      >      >
>      >      >       > performance::check_collinearity(m)
>      >      >     # Check for Multicollinearity
>      >      >
>      >      >     Low Correlation
>      >      >
>      >      >                           Term  VIF Increased SE Tolerance
>      >      >                            age 1.72         1.31      0.58
>      >      >         economic.cond.national 1.85         1.36      0.54
>      >      >        economic.cond.household 1.86         1.37      0.54
>      >      >                          Blair 1.63         1.28      0.61
>      >      >                          Hague 1.94         1.39      0.52
>      >      >                        Kennedy 1.70         1.30      0.59
>      >      >                         Europe 2.01         1.42      0.50
>      >      >            political.knowledge 1.94         1.39      0.52
>      >      >                         gender 1.78         1.33      0.56
>      >      >       > performance::check_collinearity(m1)
>      >      >     # Check for Multicollinearity
>      >      >
>      >      >     Low Correlation
>      >      >
>      >      >                           Term  VIF Increased SE Tolerance
>      >      >                            age 1.19         1.09      0.84
>      >      >         economic.cond.national 1.42         1.19      0.70
>      >      >        economic.cond.household 1.32         1.15      0.76
>      >      >                          Blair 1.50         1.22      0.67
>      >      >                          Hague 1.30         1.14      0.77
>      >      >                        Kennedy 1.19         1.09      0.84
>      >      >                         Europe 1.34         1.16      0.75
>      >      >            political.knowledge 1.30         1.14      0.77
>      >      >                         gender 1.23         1.11      0.81
>      >      >
>      >      >     --------- snip -----------
>      >      >
>      >      >     I looked at the code for vif() and check_collinearity() to
>      >     see where
>      >      >     they went wrong. Both failed to handle the two
>     intercepts in
>      >     the model
>      >      >     correctly -- vif() thought there was no intercept and
>      >      >     check_collinearity() just removed the first intercept
>     but not the
>      >      >     second.
>      >      >
>      >      >     In examining the code for check_collinearity(), I
>     discovered a
>      >      >     couple of
>      >      >     additional disconcerting facts. First, part of the
>     code seems
>      >     to be
>      >      >     copied from vif.default(). Second, as a consequence,
>      >      >     check_collinearity() actually computes GVIFs rather
>     than VIFs
>      >     (and
>      >      >     doesn't reference either the Fox and Monette paper
>      >     introducing GVIFs or
>      >      >     the car package) but doesn't seem to understand that, and,
>      >     for example,
>      >      >     takes the squareroot of the GVIF (reported in the
>     column marked
>      >      >     "Increased SE") rather than the 2p root (when there
>     are p > 1
>      >      >     coefficients in a term).
>      >      >
>      >      >     Here's the relevant code from the two functions (where
>     . . .
>      >     denotes
>      >      >     elided lines) -- the default method for vif() and
>      >      >     .check_collinearity(),
>      >      >     which is called by check_collinearity.default():
>      >      >
>      >      >     --------- snip -----------
>      >      >
>      >      >       > car:::vif.default
>      >      >     function (mod, ...)
>      >      >     {
>      >      >           . . .
>      >      >           v <- vcov(mod)
>      >      >           assign <- attr(model.matrix(mod), "assign")
>      >      >           if (names(coefficients(mod)[1]) == "(Intercept)") {
>      >      >               v <- v[-1, -1]
>      >      >               assign <- assign[-1]
>      >      >           }
>      >      >           else warning("No intercept: vifs may not be
>     sensible.")
>      >      >           terms <- labels(terms(mod))
>      >      >           n.terms <- length(terms)
>      >      >           if (n.terms < 2)
>      >      >               stop("model contains fewer than 2 terms")
>      >      >           R <- cov2cor(v)
>      >      >           detR <- det(R)
>      >      >           . . .
>      >      >           for (term in 1:n.terms) {
>      >      >               subs <- which(assign == term)
>      >      >               result[term, 1] <- det(as.matrix(R[subs,
>     subs])) *
>      >      >     det(as.matrix(R[-subs,
>      >      >                   -subs]))/detR
>      >      >               result[term, 2] <- length(subs)
>      >      >           }
>      >      >           . . .
>      >      >     }
>      >      >
>      >      >       > performance:::.check_collinearity
>      >      >     function (x, component, verbose = TRUE)
>      >      >     {
>      >      >           v <- insight::get_varcov(x, component = component,
>      >     verbose =
>      >      >     FALSE)
>      >      >           assign <- .term_assignments(x, component, verbose =
>      >     verbose)
>      >      >           . . .
>      >      >           if (insight::has_intercept(x)) {
>      >      >               v <- v[-1, -1]
>      >      >               assign <- assign[-1]
>      >      >           }
>      >      >           else {
>      >      >               if (isTRUE(verbose)) {
>      >      >                   warning("Model has no intercept. VIFs
>     may not be
>      >      >     sensible.",
>      >      >                       call. = FALSE)
>      >      >               }
>      >      >           }
>      >      >               . . .
>      >      >               terms <- labels(stats::terms(f[[component]]))
>      >      >               . . .
>      >      >           n.terms <- length(terms)
>      >      >           if (n.terms < 2) {
>      >      >               if (isTRUE(verbose)) {
>      >      >                   warning(insight::format_message(sprintf("Not
>      >     enough model
>      >      >     terms in the %s part of the model to check for
>      >     multicollinearity.",
>      >      >                       component)), call. = FALSE)
>      >      >               }
>      >      >               return(NULL)
>      >      >           }
>      >      >           R <- stats::cov2cor(v)
>      >      >           detR <- det(R)
>      >      >           . . .
>      >      >           for (term in 1:n.terms) {
>      >      >               subs <- which(assign == term)
>      >      >                   . . .
>      >      >                   result <- c(result,
>     det(as.matrix(R[subs, subs])) *
>      >      >                       det(as.matrix(R[-subs, -subs]))/detR)
>      >      >                   . . .
>      >      >           }
>      >      >           . . .
>      >      >     }
>      >      >
>      >      >     --------- snip -----------
>      >      >
>      >      >     So, the upshot of all this is that you should be able
>     to do
>      >     what you
>      >      >     want, but not with either car::vif() or
>      >      >     performance::check_collinearity(). Instead, either
>     write your own
>      >      >     function or do the computations in a script.
>      >      >
>      >      >     There's also a lesson here about S3 default methods:
>     The fact
>      >     that a
>      >      >     default method returns a result rather than throwing
>     an error
>      >     or a
>      >      >     warning doesn't mean that the result is the right answer.
>      >      >
>      >      >     I hope this helps,
>      >      >        John
>      >      >
>      >      >
>      >      >     On 2022-02-26 3:45 p.m., Juho Kristian Ruohonen wrote:
>      >      >      > Dear John W,
>      >      >      >
>      >      >      > Thank you very much for the tip-off! Apologies for not
>      >     responding
>      >      >     earlier
>      >      >      > (gmail apparently decided to direct your email
>     right into the
>      >      >     junk folder).
>      >      >      > I am very pleased to note that the package you
>     mention does
>      >      >     indeed work
>      >      >      > with *brms* multinomial models! Thanks again!
>      >      >      >
>      >      >      > Best,
>      >      >      >
>      >      >      > Juho
>      >      >      >
>      >      >      > pe 25. helmik. 2022 klo 19.23 John Willoughby
>      >      >     (johnwillec using gmail.com <mailto:johnwillec using gmail.com>
>     <mailto:johnwillec using gmail.com <mailto:johnwillec using gmail.com>>
>      >     <mailto:johnwillec using gmail.com <mailto:johnwillec using gmail.com>
>     <mailto:johnwillec using gmail.com <mailto:johnwillec using gmail.com>>>)
>      >      >      > kirjoitti:
>      >      >      >
>      >      >      >> Have you tried the check_collinearity() function
>     in the
>      >     performance
>      >      >      >> package? It's supposed to work on brms models, but
>     whether it
>      >      >     will work on
>      >      >      >> a multinomial model I don't know.  It works well
>     on mixed
>      >     models
>      >      >     generated
>      >      >      >> by glmmTMB().
>      >      >      >>
>      >      >      >> John Willoughby
>      >      >      >>
>      >      >      >>
>      >      >      >> On Fri, Feb 25, 2022 at 3:01 AM
>      >      >     <r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>
>      >     <mailto:r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>>
>      >      >     <mailto:r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>
>      >     <mailto:r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>>>>
>      >      >      >> wrote:
>      >      >      >>
>      >      >      >>> Send R-sig-mixed-models mailing list submissions to
>      >      >      >>> r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>
>      >      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>>
>      >      >      >>>
>      >      >      >>> To subscribe or unsubscribe via the World Wide
>     Web, visit
>      >      >      >>>
>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
>      >      >   
>       <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>>
>      >      >      >>> or, via email, send a message with subject or
>     body 'help' to
>      >      >      >>> r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>
>      >     <mailto:r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>>
>      >      >     <mailto:r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>
>      >     <mailto:r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>>>
>      >      >      >>>
>      >      >      >>> You can reach the person managing the list at
>      >      >      >>> r-sig-mixed-models-owner using r-project.org
>     <mailto:r-sig-mixed-models-owner using r-project.org>
>      >     <mailto:r-sig-mixed-models-owner using r-project.org
>     <mailto:r-sig-mixed-models-owner using r-project.org>>
>      >      >     <mailto:r-sig-mixed-models-owner using r-project.org
>     <mailto:r-sig-mixed-models-owner using r-project.org>
>      >     <mailto:r-sig-mixed-models-owner using r-project.org
>     <mailto:r-sig-mixed-models-owner using r-project.org>>>
>      >      >      >>>
>      >      >      >>> When replying, please edit your Subject line so it is
>      >     more specific
>      >      >      >>> than "Re: Contents of R-sig-mixed-models digest..."
>      >      >      >>>
>      >      >      >>>
>      >      >      >>> Today's Topics:
>      >      >      >>>
>      >      >      >>>     1. Collinearity diagnostics for (mixed)
>     multinomial
>      >     models
>      >      >      >>>        (Juho Kristian Ruohonen)
>      >      >      >>>
>      >      >      >>>
>      >      >
>      >     
>       ----------------------------------------------------------------------
>      >      >      >>>
>      >      >      >>> Message: 1
>      >      >      >>> Date: Fri, 25 Feb 2022 10:23:25 +0200
>      >      >      >>> From: Juho Kristian Ruohonen
>      >     <juho.kristian.ruohonen using gmail.com
>     <mailto:juho.kristian.ruohonen using gmail.com>
>      >     <mailto:juho.kristian.ruohonen using gmail.com
>     <mailto:juho.kristian.ruohonen using gmail.com>>
>      >      >     <mailto:juho.kristian.ruohonen using gmail.com
>     <mailto:juho.kristian.ruohonen using gmail.com>
>      >     <mailto:juho.kristian.ruohonen using gmail.com
>     <mailto:juho.kristian.ruohonen using gmail.com>>>>
>      >      >      >>> To: John Fox <jfox using mcmaster.ca
>     <mailto:jfox using mcmaster.ca> <mailto:jfox using mcmaster.ca
>     <mailto:jfox using mcmaster.ca>>
>      >     <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>
>     <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>>>
>      >      >      >>> Cc: "r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>
>      >      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>>"
>      >      >      >>>          <r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>
>      >      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >     <mailto:r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>>>
>      >      >      >>> Subject: [R-sig-ME] Collinearity diagnostics for
>     (mixed)
>      >      >     multinomial
>      >      >      >>>          models
>      >      >      >>> Message-ID:
>      >      >      >>>          <
>      >      >      >>>
>      >      >
>      >
>     CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
>     <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>
>      >   
>       <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>>
>      >      >
>      >     
>       <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com> <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>>>>
>      >      >      >>> Content-Type: text/plain; charset="utf-8"
>      >      >      >>>
>      >      >      >>> Dear John (and anyone else qualified to comment),
>      >      >      >>>
>      >      >      >>> I fit lots of mixed-effects multinomial models in my
>      >     research,
>      >      >     and I
>      >      >      >> would
>      >      >      >>> like to see some (multi)collinearity diagnostics
>     on the
>      >     fixed
>      >      >     effects, of
>      >      >      >>> which there are over 30. My models are fit using the
>      >     Bayesian
>      >      >     *brms*
>      >      >      >>> package because I know of no frequentist packages
>     with
>      >      >     multinomial GLMM
>      >      >      >>> compatibility.
>      >      >      >>>
>      >      >      >>> With continuous or dichotomous outcomes, my go-to
>      >     function for
>      >      >      >> calculating
>      >      >      >>> multicollinearity diagnostics is of course
>     *vif()* from
>      >     the *car*
>      >      >      >> package.
>      >      >      >>> As expected, however, this function does not
>     report sensible
>      >      >     diagnostics
>      >      >      >>> for multinomial models -- not even for standard
>     ones fit
>      >     by the
>      >      >     *nnet*
>      >      >      >>> package's *multinom()* function. The reason, I
>     presume, is
>      >      >     because a
>      >      >      >>> multinomial model is not really one but C-1
>     regression
>      >     models
>      >      >     (where C
>      >      >      >> is
>      >      >      >>> the number of response categories) and the *vif()*
>      >     function is not
>      >      >      >> designed
>      >      >      >>> to deal with this scenario.
>      >      >      >>>
>      >      >      >>> Therefore, in order to obtain meaningful collinearity
>      >     metrics,
>      >      >     my present
>      >      >      >>> plan is to write a simple helper function that uses
>      >     *vif() *to
>      >      >     calculate
>      >      >      >>> and present (generalized) variance inflation
>     metrics for
>      >     the C-1
>      >      >      >>> sub-datasets to which the C-1 component binomial
>     models
>      >     of the
>      >      >     overall
>      >      >      >>> multinomial model are fit. In other words, it
>     will partition
>      >      >     the data
>      >      >      >> into
>      >      >      >>> those C-1 subsets, and then apply *vif()* to as
>     many linear
>      >      >     regressions
>      >      >      >>> using a made-up continuous response and the fixed
>     effects of
>      >      >     interest.
>      >      >      >>>
>      >      >      >>> Does this seem like a sensible approach?
>      >      >      >>>
>      >      >      >>> Best,
>      >      >      >>>
>      >      >      >>> Juho
>      >      >      >>>
>      >      >      >>>
>      >      >      >>>
>      >      >      >>
>      >      >      >>          [[alternative HTML version deleted]]
>      >      >      >>
>      >      >      >> _______________________________________________
>      >      >      >> R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>
>      >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>>
>      >      >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>
>      >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>>> mailing list
>      >      >      >>
>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
>      >      >   
>       <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>>
>      >      >      >>
>      >      >      >
>      >      >      >       [[alternative HTML version deleted]]
>      >      >      >
>      >      >      > _______________________________________________
>      >      >      > R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>
>      >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>>
>      >      >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>
>      >     <mailto:R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org>>> mailing list
>      >      >      >
>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
>      >      >   
>       <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>>
>      >      >     --
>      >      >     John Fox, Professor Emeritus
>      >      >     McMaster University
>      >      >     Hamilton, Ontario, Canada
>      >      >     web: https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>
>      >     <https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>>
>      >      >     <https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>
>      >     <https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>>>
>      >      >
>      >     --
>      >     John Fox, Professor Emeritus
>      >     McMaster University
>      >     Hamilton, Ontario, Canada
>      >     web: https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>
>      >     <https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>>
>      >
>     -- 
>     John Fox, Professor Emeritus
>     McMaster University
>     Hamilton, Ontario, Canada
>     web: https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>
> 
------------------------------------------------------------------------
-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/



More information about the R-sig-mixed-models mailing list