[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models

Mon Feb 28 23:00:26 CET 2022

Apologies for my misreading, John, and many thanks for showing how the
calculation is done for a single term.

Do you think *vif()* might be updated in the near future with the
capability of auto-detecting a multinomial model and returning
mathematically correct GVIF statistics?

If not, I'll proceed to writing my own function based on your example.
However, *car* is such an excellent and widely used package that the
greatest benefit to mankind would probably accrue if *car *was upgraded
with this feature sooner rather than later.

Best,

Juho

ma 28. helmik. 2022 klo 17.08 John Fox (jfox using mcmaster.ca) kirjoitti:

> Dear Juho,
>
> On 2022-02-28 2:06 a.m., Juho Kristian Ruohonen wrote:
> > Dear Professor Fox and other list members,
> >
> > Profuse thanks for doing that detective work for me! I myself thought
> > the inflation factors reported by check_collinearity() were suspiciously
> > high, but unlike you I lacked the expertise to identify what was going
> on.
> >
> > As for your suggested approach, have I understood this correctly:
> >
> > Since there doesn't yet exist an R function that will calculate the
> > (G)VIFS of multinomial models correctly, my best bet for now is just to
> > ignore the fact that such models partition the data into C-1 subsets,
> > and to calculate approximate GVIFs from the entire dataset at once as if
> > the response were continuous? And a simple way to do this is to
> > construct a fake continuous response, call *lm(fakeresponse ~.)*, and
> > apply *car::vif()* on the result?
>
> No, you misunderstand my suggestion, which perhaps isn't surprising
> given the length of my message. What you propose is what I suggested as
> a rough approximation *before* I confirmed that my guess of the solution
> was correct.
>
> The R code that I sent yesterday showed how to compute the GVIF for a
> multinomial regression model, and I suggested that you write either a
> script or a simple function to do that. Here's a function that will work
> for a model object that responds to vcov():
>
> GVIF <- function(model, intercepts, term){
>    # model: regression model object
>    # intercepts: row/column positions of intercepts in the coefficient
> covariance matrix
>    # term: row/column positions of the coefficients for the focal term
>    V <- vcov(model)
>    term <- colnames(V)[term]
>    V <- V[-intercepts, -intercepts]
>    V <- cov2cor(V)
>    term <- which(colnames(V) %in% term)
>    gvif <- det(V[term, term])*det(V[-term, -term])/det(V)
>    c(GVIF=gvif, "GVIF^(1/(2*p))"=gvif^(1/(2*length(term))))
> }
>
> and here's an application to the multinom() example that I showed you
> yesterday:
>
>  > colnames(vcov(m)) # to get coefficient positions
>   [1] "Labour:(Intercept)"                       "Labour:age"
>
>   [3] "Labour:economic.cond.national"
> "Labour:economic.cond.household"
>   [5] "Labour:Blair"                             "Labour:Hague"
>
>   [7] "Labour:Kennedy"                           "Labour:Europe"
>
>   [9] "Labour:political.knowledge"               "Labour:gendermale"
>
> [11] "Liberal Democrat:(Intercept)"             "Liberal Democrat:age"
>
> [13] "Liberal Democrat:economic.cond.national"  "Liberal
> Democrat:economic.cond.household"
> [15] "Liberal Democrat:Blair"                   "Liberal Democrat:Hague"
>
> [17] "Liberal Democrat:Kennedy"                 "Liberal
> Democrat:Europe"
> [19] "Liberal Democrat:political.knowledge"     "Liberal
> Democrat:gendermale"
>
>  > GVIF(m, intercepts=c(1, 11), term=c(2, 12)) # GVIF for age
>            GVIF GVIF^(1/(2*p))
>        1.046232       1.011363
>
>
> Finally, here's what you get for a linear model with the same RHS (where
> the sqrt(VIF) should be a rough approximation to GVIF^(1/4) reported by
> my GVIF() function):
>
>  > m.lm <- lm(as.numeric(vote) ~ . - vote1, data=BEPS)
>  > sqrt(car::vif(m.lm))
>                      age  economic.cond.national economic.cond.household
>                    Blair
>                 1.006508                1.124132                1.075656
>                 1.118441
>                    Hague                 Kennedy                  Europe
>      political.knowledge
>                 1.066799                1.015532                1.101741
>                 1.028546
>                   gender
>                 1.017386
>
>
> John
>
> >
> > Best,
> >
> > Juho
> >
> > ma 28. helmik. 2022 klo 2.23 John Fox (jfox using mcmaster.ca
> > <mailto:jfox using mcmaster.ca>) kirjoitti:
> >
> >     Dear Juho,
> >
> >     I've now had a chance to think about this problem some more, and I
> >     believe that the approach I suggested is correct. I also had an
> >     opportunity to talk the problem over a bit with Georges Monette, who
> >     coauthored the paper that introduced generalized variance inflation
> >     factors (GVIFs). On the other hand, the results produced by
> >     performance::check_collinearity() for multinomial logit models don't
> >     seem to be correct (see below).
> >
> >     Here's an example, using the nnet::multinom() function to fit a
> >     multinomial logit model, with alternative parametrizations of the
> >     LHS of
> >     the model:
> >
> >     --------- snip -----------
> >
> >       > library(nnet) # for multinom()
> >       > library(carData) # for BEPS data set
> >
> >       > # alternative ordering of the response levels:
> >       > BEPS$vote1 <- factor(BEPS$vote, levels=c("Labour", "Liberal
> >     Democrat", "Conservative"))
> >       > levels(BEPS$vote)
> >     [1] "Conservative"     "Labour"           "Liberal Democrat"
> >       > levels(BEPS$vote1)
> >     [1] "Labour"           "Liberal Democrat" "Conservative"
> >
> >       > m <- multinom(vote ~ . - vote1, data=BEPS)
> >     # weights:  33 (20 variable)
> >     initial  value 1675.383740
> >     iter  10 value 1345.935273
> >     iter  20 value 1150.956807
> >     iter  30 value 1141.921662
> >     iter  30 value 1141.921661
> >     iter  30 value 1141.921661
> >     final  value 1141.921661
> >     converged
> >       > m1 <- multinom(vote1 ~ . - vote, data=BEPS)
> >     # weights:  33 (20 variable)
> >     initial  value 1675.383740
> >     iter  10 value 1280.439304
> >     iter  20 value 1165.513772
> >     final  value 1141.921662
> >     converged
> >
> >       > rbind(coef(m), coef(m1)) # compare coefficients
> >                        (Intercept)          age economic.cond.national
> >     economic.cond.household
> >     Labour             0.9515214 -0.021913989              0.5575707
> >            0.15839096
> >     Liberal Democrat   1.4119306 -0.016810735              0.1810761
> >           -0.01196664
> >     Liberal Democrat   0.4604567  0.005102666             -0.3764928
> >           -0.17036682
> >     Conservative      -0.9514466  0.021912305             -0.5575644
> >           -0.15838744
> >                             Blair       Hague    Kennedy      Europe
> >     political.knowledge
> >     Labour            0.8371764 -0.90775585  0.2513436 -0.22781308
> >     -0.5370612
> >     Liberal Democrat  0.2937331 -0.82217625  0.6710567 -0.20004624
> >     -0.2034605
> >     Liberal Democrat -0.5434408  0.08559455  0.4197027  0.02776465
> >     0.3336068
> >     Conservative     -0.8371670  0.90778068 -0.2513735  0.22781092
> >     0.5370545
> >                         gendermale
> >     Labour            0.13765774
> >     Liberal Democrat  0.12640823
> >     Liberal Democrat -0.01125898
> >     Conservative     -0.13764849
> >
> >       > c(logLik(m), logLik(m1)) # same fit to the data
> >     [1] -1141.922 -1141.922
> >
> >       > # covariance matrices for coefficients:
> >       > V <- vcov(m)
> >       > V1 <- vcov(m1)
> >       > cbind(colnames(V), colnames(V1)) # compare
> >             [,1]                                       [,2]
> >
> >        [1,] "Labour:(Intercept)"                       "Liberal
> >     Democrat:(Intercept)"
> >        [2,] "Labour:age"                               "Liberal
> >     Democrat:age"
> >
> >        [3,] "Labour:economic.cond.national"            "Liberal
> >     Democrat:economic.cond.national"
> >        [4,] "Labour:economic.cond.household"           "Liberal
> >     Democrat:economic.cond.household"
> >        [5,] "Labour:Blair"                             "Liberal
> >     Democrat:Blair"
> >        [6,] "Labour:Hague"                             "Liberal
> >     Democrat:Hague"
> >        [7,] "Labour:Kennedy"                           "Liberal
> >     Democrat:Kennedy"
> >        [8,] "Labour:Europe"                            "Liberal
> >     Democrat:Europe"
> >        [9,] "Labour:political.knowledge"               "Liberal
> >     Democrat:political.knowledge"
> >     [10,] "Labour:gendermale"                        "Liberal
> >     Democrat:gendermale"
> >     [11,] "Liberal Democrat:(Intercept)"
> >     "Conservative:(Intercept)"
> >     [12,] "Liberal Democrat:age"                     "Conservative:age"
> >
> >     [13,] "Liberal Democrat:economic.cond.national"
> >     "Conservative:economic.cond.national"
> >     [14,] "Liberal Democrat:economic.cond.household"
> >     "Conservative:economic.cond.household"
> >     [15,] "Liberal Democrat:Blair"                   "Conservative:Blair"
> >
> >     [16,] "Liberal Democrat:Hague"                   "Conservative:Hague"
> >
> >     [17,] "Liberal Democrat:Kennedy"
>  "Conservative:Kennedy"
> >
> >     [18,] "Liberal Democrat:Europe"
> "Conservative:Europe"
> >
> >     [19,] "Liberal Democrat:political.knowledge"
> >     "Conservative:political.knowledge"
> >     [20,] "Liberal Democrat:gendermale"
> >     "Conservative:gendermale"
> >
> >       > int <- c(1, 11) # remove intercepts
> >       > colnames(V)[int]
> >     [1] "Labour:(Intercept)"           "Liberal Democrat:(Intercept)"
> >
> >       > colnames(V1)[int]
> >     [1] "Liberal Democrat:(Intercept)" "Conservative:(Intercept)"
> >       > V <- V[-int, -int]
> >       > V1 <- V1[-int, -int]
> >
> >       > age <- c(1, 10) # locate age coefficients
> >       > colnames(V)[age]
> >     [1] "Labour:age"           "Liberal Democrat:age"
> >       > colnames(V1)[age]
> >     [1] "Liberal Democrat:age" "Conservative:age"
> >
> >       > V <- cov2cor(V) # compute coefficient correlations
> >       > V1 <- cov2cor(V1)
> >
> >       > # compare GVIFs:
> >       > c(det(V[age, age])*det(V[-age, -age])/det(V),
> >     +   det(V1[age, age])*det(V1[-age, -age])/det(V1))
> >     [1] 1.046232 1.046229
> >
> >     --------- snip -----------
> >
> >     For curiosity, I applied car::vif() and
> >     performance::check_collinearity() to these models to see what they
> >     would
> >     do. Both returned the wrong answer. vif() produced a warning, but
> >     check_collinearity() didn't:
> >
> >     --------- snip -----------
> >
> >       > car::vif(m1)
> >                           age  economic.cond.national
> >     economic.cond.household
> >                     15.461045               22.137772
> >       16.693877
> >                         Blair                   Hague
> >       Kennedy
> >                     14.681562                7.483039
> >       15.812067
> >                        Europe     political.knowledge
> >     gender
> >                      6.502119                4.219507
> >     2.313885
> >     Warning message:
> >     In vif.default(m1) : No intercept: vifs may not be sensible.
> >
> >       > performance::check_collinearity(m)
> >     # Check for Multicollinearity
> >
> >     Low Correlation
> >
> >                           Term  VIF Increased SE Tolerance
> >                            age 1.72         1.31      0.58
> >         economic.cond.national 1.85         1.36      0.54
> >        economic.cond.household 1.86         1.37      0.54
> >                          Blair 1.63         1.28      0.61
> >                          Hague 1.94         1.39      0.52
> >                        Kennedy 1.70         1.30      0.59
> >                         Europe 2.01         1.42      0.50
> >            political.knowledge 1.94         1.39      0.52
> >                         gender 1.78         1.33      0.56
> >       > performance::check_collinearity(m1)
> >     # Check for Multicollinearity
> >
> >     Low Correlation
> >
> >                           Term  VIF Increased SE Tolerance
> >                            age 1.19         1.09      0.84
> >         economic.cond.national 1.42         1.19      0.70
> >        economic.cond.household 1.32         1.15      0.76
> >                          Blair 1.50         1.22      0.67
> >                          Hague 1.30         1.14      0.77
> >                        Kennedy 1.19         1.09      0.84
> >                         Europe 1.34         1.16      0.75
> >            political.knowledge 1.30         1.14      0.77
> >                         gender 1.23         1.11      0.81
> >
> >     --------- snip -----------
> >
> >     I looked at the code for vif() and check_collinearity() to see where
> >     they went wrong. Both failed to handle the two intercepts in the
> model
> >     correctly -- vif() thought there was no intercept and
> >     check_collinearity() just removed the first intercept but not the
> >     second.
> >
> >     In examining the code for check_collinearity(), I discovered a
> >     couple of
> >     additional disconcerting facts. First, part of the code seems to be
> >     copied from vif.default(). Second, as a consequence,
> >     check_collinearity() actually computes GVIFs rather than VIFs (and
> >     doesn't reference either the Fox and Monette paper introducing GVIFs
> or
> >     the car package) but doesn't seem to understand that, and, for
> example,
> >     takes the squareroot of the GVIF (reported in the column marked
> >     "Increased SE") rather than the 2p root (when there are p > 1
> >     coefficients in a term).
> >
> >     Here's the relevant code from the two functions (where . . . denotes
> >     elided lines) -- the default method for vif() and
> >     .check_collinearity(),
> >     which is called by check_collinearity.default():
> >
> >     --------- snip -----------
> >
> >       > car:::vif.default
> >     function (mod, ...)
> >     {
> >           . . .
> >           v <- vcov(mod)
> >           assign <- attr(model.matrix(mod), "assign")
> >           if (names(coefficients(mod)[1]) == "(Intercept)") {
> >               v <- v[-1, -1]
> >               assign <- assign[-1]
> >           }
> >           else warning("No intercept: vifs may not be sensible.")
> >           terms <- labels(terms(mod))
> >           n.terms <- length(terms)
> >           if (n.terms < 2)
> >               stop("model contains fewer than 2 terms")
> >           R <- cov2cor(v)
> >           detR <- det(R)
> >           . . .
> >           for (term in 1:n.terms) {
> >               subs <- which(assign == term)
> >               result[term, 1] <- det(as.matrix(R[subs, subs])) *
> >     det(as.matrix(R[-subs,
> >                   -subs]))/detR
> >               result[term, 2] <- length(subs)
> >           }
> >           . . .
> >     }
> >
> >       > performance:::.check_collinearity
> >     function (x, component, verbose = TRUE)
> >     {
> >           v <- insight::get_varcov(x, component = component, verbose =
> >     FALSE)
> >           assign <- .term_assignments(x, component, verbose = verbose)
> >           . . .
> >           if (insight::has_intercept(x)) {
> >               v <- v[-1, -1]
> >               assign <- assign[-1]
> >           }
> >           else {
> >               if (isTRUE(verbose)) {
> >                   warning("Model has no intercept. VIFs may not be
> >     sensible.",
> >                       call. = FALSE)
> >               }
> >           }
> >               . . .
> >               terms <- labels(stats::terms(f[[component]]))
> >               . . .
> >           n.terms <- length(terms)
> >           if (n.terms < 2) {
> >               if (isTRUE(verbose)) {
> >                   warning(insight::format_message(sprintf("Not enough
> model
> >     terms in the %s part of the model to check for multicollinearity.",
> >                       component)), call. = FALSE)
> >               }
> >               return(NULL)
> >           }
> >           R <- stats::cov2cor(v)
> >           detR <- det(R)
> >           . . .
> >           for (term in 1:n.terms) {
> >               subs <- which(assign == term)
> >                   . . .
> >                   result <- c(result, det(as.matrix(R[subs, subs])) *
> >                       det(as.matrix(R[-subs, -subs]))/detR)
> >                   . . .
> >           }
> >           . . .
> >     }
> >
> >     --------- snip -----------
> >
> >     So, the upshot of all this is that you should be able to do what you
> >     want, but not with either car::vif() or
> >     performance::check_collinearity(). Instead, either write your own
> >     function or do the computations in a script.
> >
> >     There's also a lesson here about S3 default methods: The fact that a
> >     default method returns a result rather than throwing an error or a
> >     warning doesn't mean that the result is the right answer.
> >
> >     I hope this helps,
> >        John
> >
> >
> >     On 2022-02-26 3:45 p.m., Juho Kristian Ruohonen wrote:
> >      > Dear John W,
> >      >
> >      > Thank you very much for the tip-off! Apologies for not responding
> >     earlier
> >      > (gmail apparently decided to direct your email right into the
> >     junk folder).
> >      > I am very pleased to note that the package you mention does
> >     indeed work
> >      > with *brms* multinomial models! Thanks again!
> >      >
> >      > Best,
> >      >
> >      > Juho
> >      >
> >      > pe 25. helmik. 2022 klo 19.23 John Willoughby
> >     (johnwillec using gmail.com <mailto:johnwillec using gmail.com>)
> >      > kirjoitti:
> >      >
> >      >> Have you tried the check_collinearity() function in the
> performance
> >      >> package? It's supposed to work on brms models, but whether it
> >     will work on
> >      >> a multinomial model I don't know.  It works well on mixed models
> >     generated
> >      >> by glmmTMB().
> >      >>
> >      >> John Willoughby
> >      >>
> >      >>
> >      >> On Fri, Feb 25, 2022 at 3:01 AM
> >     <r-sig-mixed-models-request using r-project.org
> >     <mailto:r-sig-mixed-models-request using r-project.org>>
> >      >> wrote:
> >      >>
> >      >>> Send R-sig-mixed-models mailing list submissions to
> >      >>> r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>
> >      >>>
> >      >>> To subscribe or unsubscribe via the World Wide Web, visit
> >      >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >>> or, via email, send a message with subject or body 'help' to
> >      >>> r-sig-mixed-models-request using r-project.org
> >     <mailto:r-sig-mixed-models-request using r-project.org>
> >      >>>
> >      >>> You can reach the person managing the list at
> >      >>> r-sig-mixed-models-owner using r-project.org
> >     <mailto:r-sig-mixed-models-owner using r-project.org>
> >      >>>
> >      >>> When replying, please edit your Subject line so it is more
> specific
> >      >>> than "Re: Contents of R-sig-mixed-models digest..."
> >      >>>
> >      >>>
> >      >>> Today's Topics:
> >      >>>
> >      >>>     1. Collinearity diagnostics for (mixed) multinomial models
> >      >>>        (Juho Kristian Ruohonen)
> >      >>>
> >      >>>
> >
>  ----------------------------------------------------------------------
> >      >>>
> >      >>> Message: 1
> >      >>> Date: Fri, 25 Feb 2022 10:23:25 +0200
> >      >>> From: Juho Kristian Ruohonen <juho.kristian.ruohonen using gmail.com
> >     <mailto:juho.kristian.ruohonen using gmail.com>>
> >      >>> To: John Fox <jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>
> >      >>> Cc: "r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>"
> >      >>>          <r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>>
> >      >>> Subject: [R-sig-ME] Collinearity diagnostics for (mixed)
> >     multinomial
> >      >>>          models
> >      >>> Message-ID:
> >      >>>          <
> >      >>>
> >     CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> >     <mailto:
> CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>>
> >      >>> Content-Type: text/plain; charset="utf-8"
> >      >>>
> >      >>> Dear John (and anyone else qualified to comment),
> >      >>>
> >      >>> I fit lots of mixed-effects multinomial models in my research,
> >     and I
> >      >> would
> >      >>> like to see some (multi)collinearity diagnostics on the fixed
> >     effects, of
> >      >>> which there are over 30. My models are fit using the Bayesian
> >     *brms*
> >      >>> package because I know of no frequentist packages with
> >     multinomial GLMM
> >      >>> compatibility.
> >      >>>
> >      >>> With continuous or dichotomous outcomes, my go-to function for
> >      >> calculating
> >      >>> multicollinearity diagnostics is of course *vif()* from the
> *car*
> >      >> package.
> >      >>> As expected, however, this function does not report sensible
> >     diagnostics
> >      >>> for multinomial models -- not even for standard ones fit by the
> >     *nnet*
> >      >>> package's *multinom()* function. The reason, I presume, is
> >     because a
> >      >>> multinomial model is not really one but C-1 regression models
> >     (where C
> >      >> is
> >      >>> the number of response categories) and the *vif()* function is
> not
> >      >> designed
> >      >>> to deal with this scenario.
> >      >>>
> >      >>> Therefore, in order to obtain meaningful collinearity metrics,
> >     my present
> >      >>> plan is to write a simple helper function that uses *vif() *to
> >     calculate
> >      >>> and present (generalized) variance inflation metrics for the C-1
> >      >>> sub-datasets to which the C-1 component binomial models of the
> >     overall
> >      >>> multinomial model are fit. In other words, it will partition
> >     the data
> >      >> into
> >      >>> those C-1 subsets, and then apply *vif()* to as many linear
> >     regressions
> >      >>> using a made-up continuous response and the fixed effects of
> >     interest.
> >      >>>
> >      >>> Does this seem like a sensible approach?
> >      >>>
> >      >>> Best,
> >      >>>
> >      >>> Juho
> >      >>>
> >      >>>
> >      >>>
> >      >>
> >      >>          [[alternative HTML version deleted]]
> >      >>
> >      >> _______________________________________________
> >      >> R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org> mailing list
> >      >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >>
> >      >
> >      >       [[alternative HTML version deleted]]
> >      >
> >      > _______________________________________________
> >      > R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org> mailing list
> >      > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >     --
> >     John Fox, Professor Emeritus
> >     McMaster University
> >     Hamilton, Ontario, Canada
> >     web: https://socialsciences.mcmaster.ca/jfox/
> >     <https://socialsciences.mcmaster.ca/jfox/>
> >
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>

	[[alternative HTML version deleted]]