[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models

Mon Feb 28 16:08:06 CET 2022

Dear Juho,

On 2022-02-28 2:06 a.m., Juho Kristian Ruohonen wrote:
> Dear Professor Fox and other list members,
> 
> Profuse thanks for doing that detective work for me! I myself thought 
> the inflation factors reported by check_collinearity() were suspiciously 
> high, but unlike you I lacked the expertise to identify what was going on.
> 
> As for your suggested approach, have I understood this correctly:
> 
> Since there doesn't yet exist an R function that will calculate the 
> (G)VIFS of multinomial models correctly, my best bet for now is just to 
> ignore the fact that such models partition the data into C-1 subsets, 
> and to calculate approximate GVIFs from the entire dataset at once as if 
> the response were continuous? And a simple way to do this is to 
> construct a fake continuous response, call *lm(fakeresponse ~.)*, and 
> apply *car::vif()* on the result?

No, you misunderstand my suggestion, which perhaps isn't surprising 
given the length of my message. What you propose is what I suggested as 
a rough approximation *before* I confirmed that my guess of the solution 
was correct.

The R code that I sent yesterday showed how to compute the GVIF for a 
multinomial regression model, and I suggested that you write either a 
script or a simple function to do that. Here's a function that will work 
for a model object that responds to vcov():

GVIF <- function(model, intercepts, term){
   # model: regression model object
   # intercepts: row/column positions of intercepts in the coefficient 
covariance matrix
   # term: row/column positions of the coefficients for the focal term
   V <- vcov(model)
   term <- colnames(V)[term]
   V <- V[-intercepts, -intercepts]
   V <- cov2cor(V)
   term <- which(colnames(V) %in% term)
   gvif <- det(V[term, term])*det(V[-term, -term])/det(V)
   c(GVIF=gvif, "GVIF^(1/(2*p))"=gvif^(1/(2*length(term))))
}

and here's an application to the multinom() example that I showed you 
yesterday:

 > colnames(vcov(m)) # to get coefficient positions
  [1] "Labour:(Intercept)"                       "Labour:age" 

  [3] "Labour:economic.cond.national" 
"Labour:economic.cond.household"
  [5] "Labour:Blair"                             "Labour:Hague" 

  [7] "Labour:Kennedy"                           "Labour:Europe" 

  [9] "Labour:political.knowledge"               "Labour:gendermale" 

[11] "Liberal Democrat:(Intercept)"             "Liberal Democrat:age" 

[13] "Liberal Democrat:economic.cond.national"  "Liberal 
Democrat:economic.cond.household"
[15] "Liberal Democrat:Blair"                   "Liberal Democrat:Hague" 

[17] "Liberal Democrat:Kennedy"                 "Liberal 
Democrat:Europe"
[19] "Liberal Democrat:political.knowledge"     "Liberal 
Democrat:gendermale"

 > GVIF(m, intercepts=c(1, 11), term=c(2, 12)) # GVIF for age
           GVIF GVIF^(1/(2*p))
       1.046232       1.011363

Finally, here's what you get for a linear model with the same RHS (where 
the sqrt(VIF) should be a rough approximation to GVIF^(1/4) reported by 
my GVIF() function):

 > m.lm <- lm(as.numeric(vote) ~ . - vote1, data=BEPS)
 > sqrt(car::vif(m.lm))
                     age  economic.cond.national economic.cond.household 
                   Blair
                1.006508                1.124132                1.075656 
                1.118441
                   Hague                 Kennedy                  Europe 
     political.knowledge
                1.066799                1.015532                1.101741 
                1.028546
                  gender
                1.017386

John

> 
> Best,
> 
> Juho
> 
> ma 28. helmik. 2022 klo 2.23 John Fox (jfox using mcmaster.ca 
> <mailto:jfox using mcmaster.ca>) kirjoitti:
> 
>     Dear Juho,
> 
>     I've now had a chance to think about this problem some more, and I
>     believe that the approach I suggested is correct. I also had an
>     opportunity to talk the problem over a bit with Georges Monette, who
>     coauthored the paper that introduced generalized variance inflation
>     factors (GVIFs). On the other hand, the results produced by
>     performance::check_collinearity() for multinomial logit models don't
>     seem to be correct (see below).
> 
>     Here's an example, using the nnet::multinom() function to fit a
>     multinomial logit model, with alternative parametrizations of the
>     LHS of
>     the model:
> 
>     --------- snip -----------
> 
>       > library(nnet) # for multinom()
>       > library(carData) # for BEPS data set
> 
>       > # alternative ordering of the response levels:
>       > BEPS$vote1 <- factor(BEPS$vote, levels=c("Labour", "Liberal
>     Democrat", "Conservative"))
>       > levels(BEPS$vote)
>     [1] "Conservative"     "Labour"           "Liberal Democrat"
>       > levels(BEPS$vote1)
>     [1] "Labour"           "Liberal Democrat" "Conservative"
> 
>       > m <- multinom(vote ~ . - vote1, data=BEPS)
>     # weights:  33 (20 variable)
>     initial  value 1675.383740
>     iter  10 value 1345.935273
>     iter  20 value 1150.956807
>     iter  30 value 1141.921662
>     iter  30 value 1141.921661
>     iter  30 value 1141.921661
>     final  value 1141.921661
>     converged
>       > m1 <- multinom(vote1 ~ . - vote, data=BEPS)
>     # weights:  33 (20 variable)
>     initial  value 1675.383740
>     iter  10 value 1280.439304
>     iter  20 value 1165.513772
>     final  value 1141.921662
>     converged
> 
>       > rbind(coef(m), coef(m1)) # compare coefficients
>                        (Intercept)          age economic.cond.national
>     economic.cond.household
>     Labour             0.9515214 -0.021913989              0.5575707
>            0.15839096
>     Liberal Democrat   1.4119306 -0.016810735              0.1810761
>           -0.01196664
>     Liberal Democrat   0.4604567  0.005102666             -0.3764928
>           -0.17036682
>     Conservative      -0.9514466  0.021912305             -0.5575644
>           -0.15838744
>                             Blair       Hague    Kennedy      Europe
>     political.knowledge
>     Labour            0.8371764 -0.90775585  0.2513436 -0.22781308
>     -0.5370612
>     Liberal Democrat  0.2937331 -0.82217625  0.6710567 -0.20004624
>     -0.2034605
>     Liberal Democrat -0.5434408  0.08559455  0.4197027  0.02776465
>     0.3336068
>     Conservative     -0.8371670  0.90778068 -0.2513735  0.22781092
>     0.5370545
>                         gendermale
>     Labour            0.13765774
>     Liberal Democrat  0.12640823
>     Liberal Democrat -0.01125898
>     Conservative     -0.13764849
> 
>       > c(logLik(m), logLik(m1)) # same fit to the data
>     [1] -1141.922 -1141.922
> 
>       > # covariance matrices for coefficients:
>       > V <- vcov(m)
>       > V1 <- vcov(m1)
>       > cbind(colnames(V), colnames(V1)) # compare
>             [,1]                                       [,2]
> 
>        [1,] "Labour:(Intercept)"                       "Liberal
>     Democrat:(Intercept)"
>        [2,] "Labour:age"                               "Liberal
>     Democrat:age"
> 
>        [3,] "Labour:economic.cond.national"            "Liberal
>     Democrat:economic.cond.national"
>        [4,] "Labour:economic.cond.household"           "Liberal
>     Democrat:economic.cond.household"
>        [5,] "Labour:Blair"                             "Liberal
>     Democrat:Blair"
>        [6,] "Labour:Hague"                             "Liberal
>     Democrat:Hague"
>        [7,] "Labour:Kennedy"                           "Liberal
>     Democrat:Kennedy"
>        [8,] "Labour:Europe"                            "Liberal
>     Democrat:Europe"
>        [9,] "Labour:political.knowledge"               "Liberal
>     Democrat:political.knowledge"
>     [10,] "Labour:gendermale"                        "Liberal
>     Democrat:gendermale"
>     [11,] "Liberal Democrat:(Intercept)"
>     "Conservative:(Intercept)"
>     [12,] "Liberal Democrat:age"                     "Conservative:age"
> 
>     [13,] "Liberal Democrat:economic.cond.national"
>     "Conservative:economic.cond.national"
>     [14,] "Liberal Democrat:economic.cond.household"
>     "Conservative:economic.cond.household"
>     [15,] "Liberal Democrat:Blair"                   "Conservative:Blair"
> 
>     [16,] "Liberal Democrat:Hague"                   "Conservative:Hague"
> 
>     [17,] "Liberal Democrat:Kennedy"                 "Conservative:Kennedy"
> 
>     [18,] "Liberal Democrat:Europe"                  "Conservative:Europe"
> 
>     [19,] "Liberal Democrat:political.knowledge"
>     "Conservative:political.knowledge"
>     [20,] "Liberal Democrat:gendermale"
>     "Conservative:gendermale"
> 
>       > int <- c(1, 11) # remove intercepts
>       > colnames(V)[int]
>     [1] "Labour:(Intercept)"           "Liberal Democrat:(Intercept)"
> 
>       > colnames(V1)[int]
>     [1] "Liberal Democrat:(Intercept)" "Conservative:(Intercept)"
>       > V <- V[-int, -int]
>       > V1 <- V1[-int, -int]
> 
>       > age <- c(1, 10) # locate age coefficients
>       > colnames(V)[age]
>     [1] "Labour:age"           "Liberal Democrat:age"
>       > colnames(V1)[age]
>     [1] "Liberal Democrat:age" "Conservative:age"
> 
>       > V <- cov2cor(V) # compute coefficient correlations
>       > V1 <- cov2cor(V1)
> 
>       > # compare GVIFs:
>       > c(det(V[age, age])*det(V[-age, -age])/det(V),
>     +   det(V1[age, age])*det(V1[-age, -age])/det(V1))
>     [1] 1.046232 1.046229
> 
>     --------- snip -----------
> 
>     For curiosity, I applied car::vif() and
>     performance::check_collinearity() to these models to see what they
>     would
>     do. Both returned the wrong answer. vif() produced a warning, but
>     check_collinearity() didn't:
> 
>     --------- snip -----------
> 
>       > car::vif(m1)
>                           age  economic.cond.national
>     economic.cond.household
>                     15.461045               22.137772             
>       16.693877
>                         Blair                   Hague               
>       Kennedy
>                     14.681562                7.483039             
>       15.812067
>                        Europe     political.knowledge                 
>     gender
>                      6.502119                4.219507               
>     2.313885
>     Warning message:
>     In vif.default(m1) : No intercept: vifs may not be sensible.
> 
>       > performance::check_collinearity(m)
>     # Check for Multicollinearity
> 
>     Low Correlation
> 
>                           Term  VIF Increased SE Tolerance
>                            age 1.72         1.31      0.58
>         economic.cond.national 1.85         1.36      0.54
>        economic.cond.household 1.86         1.37      0.54
>                          Blair 1.63         1.28      0.61
>                          Hague 1.94         1.39      0.52
>                        Kennedy 1.70         1.30      0.59
>                         Europe 2.01         1.42      0.50
>            political.knowledge 1.94         1.39      0.52
>                         gender 1.78         1.33      0.56
>       > performance::check_collinearity(m1)
>     # Check for Multicollinearity
> 
>     Low Correlation
> 
>                           Term  VIF Increased SE Tolerance
>                            age 1.19         1.09      0.84
>         economic.cond.national 1.42         1.19      0.70
>        economic.cond.household 1.32         1.15      0.76
>                          Blair 1.50         1.22      0.67
>                          Hague 1.30         1.14      0.77
>                        Kennedy 1.19         1.09      0.84
>                         Europe 1.34         1.16      0.75
>            political.knowledge 1.30         1.14      0.77
>                         gender 1.23         1.11      0.81
> 
>     --------- snip -----------
> 
>     I looked at the code for vif() and check_collinearity() to see where
>     they went wrong. Both failed to handle the two intercepts in the model
>     correctly -- vif() thought there was no intercept and
>     check_collinearity() just removed the first intercept but not the
>     second.
> 
>     In examining the code for check_collinearity(), I discovered a
>     couple of
>     additional disconcerting facts. First, part of the code seems to be
>     copied from vif.default(). Second, as a consequence,
>     check_collinearity() actually computes GVIFs rather than VIFs (and
>     doesn't reference either the Fox and Monette paper introducing GVIFs or
>     the car package) but doesn't seem to understand that, and, for example,
>     takes the squareroot of the GVIF (reported in the column marked
>     "Increased SE") rather than the 2p root (when there are p > 1
>     coefficients in a term).
> 
>     Here's the relevant code from the two functions (where . . . denotes
>     elided lines) -- the default method for vif() and
>     .check_collinearity(),
>     which is called by check_collinearity.default():
> 
>     --------- snip -----------
> 
>       > car:::vif.default
>     function (mod, ...)
>     {
>           . . .
>           v <- vcov(mod)
>           assign <- attr(model.matrix(mod), "assign")
>           if (names(coefficients(mod)[1]) == "(Intercept)") {
>               v <- v[-1, -1]
>               assign <- assign[-1]
>           }
>           else warning("No intercept: vifs may not be sensible.")
>           terms <- labels(terms(mod))
>           n.terms <- length(terms)
>           if (n.terms < 2)
>               stop("model contains fewer than 2 terms")
>           R <- cov2cor(v)
>           detR <- det(R)
>           . . .
>           for (term in 1:n.terms) {
>               subs <- which(assign == term)
>               result[term, 1] <- det(as.matrix(R[subs, subs])) *
>     det(as.matrix(R[-subs,
>                   -subs]))/detR
>               result[term, 2] <- length(subs)
>           }
>           . . .
>     }
> 
>       > performance:::.check_collinearity
>     function (x, component, verbose = TRUE)
>     {
>           v <- insight::get_varcov(x, component = component, verbose =
>     FALSE)
>           assign <- .term_assignments(x, component, verbose = verbose)
>           . . .
>           if (insight::has_intercept(x)) {
>               v <- v[-1, -1]
>               assign <- assign[-1]
>           }
>           else {
>               if (isTRUE(verbose)) {
>                   warning("Model has no intercept. VIFs may not be
>     sensible.",
>                       call. = FALSE)
>               }
>           }
>               . . .
>               terms <- labels(stats::terms(f[[component]]))
>               . . .
>           n.terms <- length(terms)
>           if (n.terms < 2) {
>               if (isTRUE(verbose)) {
>                   warning(insight::format_message(sprintf("Not enough model
>     terms in the %s part of the model to check for multicollinearity.",
>                       component)), call. = FALSE)
>               }
>               return(NULL)
>           }
>           R <- stats::cov2cor(v)
>           detR <- det(R)
>           . . .
>           for (term in 1:n.terms) {
>               subs <- which(assign == term)
>                   . . .
>                   result <- c(result, det(as.matrix(R[subs, subs])) *
>                       det(as.matrix(R[-subs, -subs]))/detR)
>                   . . .
>           }
>           . . .
>     }
> 
>     --------- snip -----------
> 
>     So, the upshot of all this is that you should be able to do what you
>     want, but not with either car::vif() or
>     performance::check_collinearity(). Instead, either write your own
>     function or do the computations in a script.
> 
>     There's also a lesson here about S3 default methods: The fact that a
>     default method returns a result rather than throwing an error or a
>     warning doesn't mean that the result is the right answer.
> 
>     I hope this helps,
>        John
> 
> 
>     On 2022-02-26 3:45 p.m., Juho Kristian Ruohonen wrote:
>      > Dear John W,
>      >
>      > Thank you very much for the tip-off! Apologies for not responding
>     earlier
>      > (gmail apparently decided to direct your email right into the
>     junk folder).
>      > I am very pleased to note that the package you mention does
>     indeed work
>      > with *brms* multinomial models! Thanks again!
>      >
>      > Best,
>      >
>      > Juho
>      >
>      > pe 25. helmik. 2022 klo 19.23 John Willoughby
>     (johnwillec using gmail.com <mailto:johnwillec using gmail.com>)
>      > kirjoitti:
>      >
>      >> Have you tried the check_collinearity() function in the performance
>      >> package? It's supposed to work on brms models, but whether it
>     will work on
>      >> a multinomial model I don't know.  It works well on mixed models
>     generated
>      >> by glmmTMB().
>      >>
>      >> John Willoughby
>      >>
>      >>
>      >> On Fri, Feb 25, 2022 at 3:01 AM
>     <r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>>
>      >> wrote:
>      >>
>      >>> Send R-sig-mixed-models mailing list submissions to
>      >>> r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>
>      >>>
>      >>> To subscribe or unsubscribe via the World Wide Web, visit
>      >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >>> or, via email, send a message with subject or body 'help' to
>      >>> r-sig-mixed-models-request using r-project.org
>     <mailto:r-sig-mixed-models-request using r-project.org>
>      >>>
>      >>> You can reach the person managing the list at
>      >>> r-sig-mixed-models-owner using r-project.org
>     <mailto:r-sig-mixed-models-owner using r-project.org>
>      >>>
>      >>> When replying, please edit your Subject line so it is more specific
>      >>> than "Re: Contents of R-sig-mixed-models digest..."
>      >>>
>      >>>
>      >>> Today's Topics:
>      >>>
>      >>>     1. Collinearity diagnostics for (mixed) multinomial models
>      >>>        (Juho Kristian Ruohonen)
>      >>>
>      >>>
>     ----------------------------------------------------------------------
>      >>>
>      >>> Message: 1
>      >>> Date: Fri, 25 Feb 2022 10:23:25 +0200
>      >>> From: Juho Kristian Ruohonen <juho.kristian.ruohonen using gmail.com
>     <mailto:juho.kristian.ruohonen using gmail.com>>
>      >>> To: John Fox <jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>
>      >>> Cc: "r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>"
>      >>>          <r-sig-mixed-models using r-project.org
>     <mailto:r-sig-mixed-models using r-project.org>>
>      >>> Subject: [R-sig-ME] Collinearity diagnostics for (mixed)
>     multinomial
>      >>>          models
>      >>> Message-ID:
>      >>>          <
>      >>>
>     CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
>     <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>>
>      >>> Content-Type: text/plain; charset="utf-8"
>      >>>
>      >>> Dear John (and anyone else qualified to comment),
>      >>>
>      >>> I fit lots of mixed-effects multinomial models in my research,
>     and I
>      >> would
>      >>> like to see some (multi)collinearity diagnostics on the fixed
>     effects, of
>      >>> which there are over 30. My models are fit using the Bayesian
>     *brms*
>      >>> package because I know of no frequentist packages with
>     multinomial GLMM
>      >>> compatibility.
>      >>>
>      >>> With continuous or dichotomous outcomes, my go-to function for
>      >> calculating
>      >>> multicollinearity diagnostics is of course *vif()* from the *car*
>      >> package.
>      >>> As expected, however, this function does not report sensible
>     diagnostics
>      >>> for multinomial models -- not even for standard ones fit by the
>     *nnet*
>      >>> package's *multinom()* function. The reason, I presume, is
>     because a
>      >>> multinomial model is not really one but C-1 regression models 
>     (where C
>      >> is
>      >>> the number of response categories) and the *vif()* function is not
>      >> designed
>      >>> to deal with this scenario.
>      >>>
>      >>> Therefore, in order to obtain meaningful collinearity metrics,
>     my present
>      >>> plan is to write a simple helper function that uses *vif() *to
>     calculate
>      >>> and present (generalized) variance inflation metrics for the C-1
>      >>> sub-datasets to which the C-1 component binomial models of the
>     overall
>      >>> multinomial model are fit. In other words, it will partition
>     the data
>      >> into
>      >>> those C-1 subsets, and then apply *vif()* to as many linear
>     regressions
>      >>> using a made-up continuous response and the fixed effects of
>     interest.
>      >>>
>      >>> Does this seem like a sensible approach?
>      >>>
>      >>> Best,
>      >>>
>      >>> Juho
>      >>>
>      >>>
>      >>>
>      >>
>      >>          [[alternative HTML version deleted]]
>      >>
>      >> _______________________________________________
>      >> R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org> mailing list
>      >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>      >>
>      >
>      >       [[alternative HTML version deleted]]
>      >
>      > _______________________________________________
>      > R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org> mailing list
>      > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>     -- 
>     John Fox, Professor Emeritus
>     McMaster University
>     Hamilton, Ontario, Canada
>     web: https://socialsciences.mcmaster.ca/jfox/
>     <https://socialsciences.mcmaster.ca/jfox/>
> 
-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/