[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models

John Fox jfox at mcmaster.ca
Mon Feb 28 23:05:45 CET 2022


Dear Juho,

On 2022-02-28 5:00 p.m., Juho Kristian Ruohonen wrote:
> Apologies for my misreading, John, and many thanks for showing how the 
> calculation is done for a single term.
> 
> Do you think *vif()* might be updated in the near future with the 
> capability of auto-detecting a multinomial model and returning 
> mathematically correct GVIF statistics?

The thought crossed my mind, but I'd want to do it in a general way, not 
just for the multinom() function, and in a way that avoids incorrect 
results such as those currently produced for "multinom" models, albeit 
with a warning. I can't guarantee whether or when I'll be able to do that.
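
A rough sketch of what such auto-detection could look like appears below. It is
only a sketch, not anything in car: it locates the intercepts and the focal
term's coefficients from the "response:coefficient" names that
colnames(vcov(model)) shows for "multinom" fits, and then calls the GVIF()
helper quoted later in this message. Factor predictors with several dummy
coefficients would need the model's term assignments and are not handled here.

gvif_multinom <- function(model, term) {
    nms <- colnames(vcov(model))
    # intercepts, e.g. "Labour:(Intercept)" and "Liberal Democrat:(Intercept)"
    intercepts <- grep(":\\(Intercept\\)$", nms)
    # coefficients for the focal term, e.g. "Labour:age" and "Liberal Democrat:age"
    term.pos <- grep(paste0(":", term, "$"), nms)
    if (length(term.pos) == 0) stop("term not found among the coefficients")
    GVIF(model, intercepts = intercepts, term = term.pos)
}

# e.g., gvif_multinom(m, "age") should agree with
# GVIF(m, intercepts=c(1, 11), term=c(2, 12)) in the quoted message below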

John

> 
> If not, I'll proceed to write my own function based on your example.
> However, /car/ is such an excellent and widely used package that the
> greatest benefit to mankind would probably accrue if /car/ were upgraded
> with this feature sooner rather than later.
> 
> Best,
> 
> Juho
> 
> On Mon, 28 Feb 2022 at 17:08, John Fox (jfox at mcmaster.ca) wrote:
> 
>     Dear Juho,
> 
>     On 2022-02-28 2:06 a.m., Juho Kristian Ruohonen wrote:
>      > Dear Professor Fox and other list members,
>      >
>      > Profuse thanks for doing that detective work for me! I myself thought
>      > the inflation factors reported by check_collinearity() were suspiciously
>      > high, but unlike you I lacked the expertise to identify what was going on.
>      >
>      > As for your suggested approach, have I understood this correctly:
>      >
>      > Since there doesn't yet exist an R function that will calculate the
>      > (G)VIFs of multinomial models correctly, my best bet for now is just to
>      > ignore the fact that such models partition the data into C-1 subsets,
>      > and to calculate approximate GVIFs from the entire dataset at once as if
>      > the response were continuous? And a simple way to do this is to
>      > construct a fake continuous response, call *lm(fakeresponse ~ .)*, and
>      > apply *car::vif()* to the result?
> 
>     No, you misunderstand my suggestion, which perhaps isn't surprising
>     given the length of my message. What you propose is what I suggested as
>     a rough approximation *before* I confirmed that my guess of the solution
>     was correct.
> 
>     The R code that I sent yesterday showed how to compute the GVIF for a
>     multinomial regression model, and I suggested that you write either a
>     script or a simple function to do that. Here's a function that will work
>     for a model object that responds to vcov():
> 
>     GVIF <- function(model, intercepts, term){
>         # model: regression model object
>         # intercepts: row/column positions of intercepts in the coefficient
>         #     covariance matrix
>         # term: row/column positions of the coefficients for the focal term
>         V <- vcov(model)
>         term <- colnames(V)[term]
>         V <- V[-intercepts, -intercepts]
>         V <- cov2cor(V)
>         term <- which(colnames(V) %in% term)
>         gvif <- det(V[term, term])*det(V[-term, -term])/det(V)
>         c(GVIF=gvif, "GVIF^(1/(2*p))"=gvif^(1/(2*length(term))))
>     }
> 
>     and here's an application to the multinom() example that I showed you
>     yesterday:
> 
>       > colnames(vcov(m)) # to get coefficient positions
>       [1] "Labour:(Intercept)"                       "Labour:age"
>       [3] "Labour:economic.cond.national"            "Labour:economic.cond.household"
>       [5] "Labour:Blair"                             "Labour:Hague"
>       [7] "Labour:Kennedy"                           "Labour:Europe"
>       [9] "Labour:political.knowledge"               "Labour:gendermale"
>      [11] "Liberal Democrat:(Intercept)"             "Liberal Democrat:age"
>      [13] "Liberal Democrat:economic.cond.national"  "Liberal Democrat:economic.cond.household"
>      [15] "Liberal Democrat:Blair"                   "Liberal Democrat:Hague"
>      [17] "Liberal Democrat:Kennedy"                 "Liberal Democrat:Europe"
>      [19] "Liberal Democrat:political.knowledge"     "Liberal Democrat:gendermale"
> 
>       > GVIF(m, intercepts=c(1, 11), term=c(2, 12)) # GVIF for age
>                 GVIF GVIF^(1/(2*p))
>             1.046232       1.011363
> 
> 
>     Finally, here's what you get for a linear model with the same RHS (where
>     the sqrt(VIF) should be a rough approximation to the GVIF^(1/4) reported
>     by my GVIF() function):
> 
>       > m.lm <- lm(as.numeric(vote) ~ . - vote1, data=BEPS)
>       > sqrt(car::vif(m.lm))
>                         age  economic.cond.national economic.cond.household                   Blair
>                    1.006508                1.124132                1.075656                1.118441
>                       Hague                 Kennedy                  Europe     political.knowledge
>                    1.066799                1.015532                1.101741                1.028546
>                      gender
>                    1.017386
> 
> 
>     John
> 
>      >
>      > Best,
>      >
>      > Juho
>      >
>      > On Mon, 28 Feb 2022 at 2:23, John Fox (jfox at mcmaster.ca) wrote:
>      >
>      >     Dear Juho,
>      >
>      >     I've now had a chance to think about this problem some more, and I
>      >     believe that the approach I suggested is correct. I also had an
>      >     opportunity to talk the problem over a bit with Georges Monette, who
>      >     coauthored the paper that introduced generalized variance inflation
>      >     factors (GVIFs). On the other hand, the results produced by
>      >     performance::check_collinearity() for multinomial logit models don't
>      >     seem to be correct (see below).
>      >
>      >     Here's an example, using the nnet::multinom() function to fit a
>      >     multinomial logit model, with alternative parametrizations of the
>      >     LHS of the model:
>      >
>      >     --------- snip -----------
>      >
>      >       > library(nnet) # for multinom()
>      >       > library(carData) # for BEPS data set
>      >
>      >       > # alternative ordering of the response levels:
>      >       > BEPS$vote1 <- factor(BEPS$vote,
>      >       +     levels=c("Labour", "Liberal Democrat", "Conservative"))
>      >       > levels(BEPS$vote)
>      >     [1] "Conservative"     "Labour"           "Liberal Democrat"
>      >       > levels(BEPS$vote1)
>      >     [1] "Labour"           "Liberal Democrat" "Conservative"
>      >
>      >       > m <- multinom(vote ~ . - vote1, data=BEPS)
>      >     # weights:  33 (20 variable)
>      >     initial  value 1675.383740
>      >     iter  10 value 1345.935273
>      >     iter  20 value 1150.956807
>      >     iter  30 value 1141.921662
>      >     iter  30 value 1141.921661
>      >     iter  30 value 1141.921661
>      >     final  value 1141.921661
>      >     converged
>      >       > m1 <- multinom(vote1 ~ . - vote, data=BEPS)
>      >     # weights:  33 (20 variable)
>      >     initial  value 1675.383740
>      >     iter  10 value 1280.439304
>      >     iter  20 value 1165.513772
>      >     final  value 1141.921662
>      >     converged
>      >
>      >       > rbind(coef(m), coef(m1)) # compare coefficients
>      >                      (Intercept)          age economic.cond.national economic.cond.household
>      >     Labour             0.9515214 -0.021913989              0.5575707               0.15839096
>      >     Liberal Democrat   1.4119306 -0.016810735              0.1810761              -0.01196664
>      >     Liberal Democrat   0.4604567  0.005102666             -0.3764928              -0.17036682
>      >     Conservative      -0.9514466  0.021912305             -0.5575644              -0.15838744
>      >                           Blair       Hague    Kennedy      Europe political.knowledge
>      >     Labour            0.8371764 -0.90775585  0.2513436 -0.22781308          -0.5370612
>      >     Liberal Democrat  0.2937331 -0.82217625  0.6710567 -0.20004624          -0.2034605
>      >     Liberal Democrat -0.5434408  0.08559455  0.4197027  0.02776465           0.3336068
>      >     Conservative     -0.8371670  0.90778068 -0.2513735  0.22781092           0.5370545
>      >                       gendermale
>      >     Labour            0.13765774
>      >     Liberal Democrat  0.12640823
>      >     Liberal Democrat -0.01125898
>      >     Conservative     -0.13764849
>      >
>      >       > c(logLik(m), logLik(m1)) # same fit to the data
>      >     [1] -1141.922 -1141.922
>      >
>      >       > # covariance matrices for coefficients:
>      >       > V <- vcov(m)
>      >       > V1 <- vcov(m1)
>      >       > cbind(colnames(V), colnames(V1)) # compare
>      >           [,1]                                       [,2]
>      >      [1,] "Labour:(Intercept)"                       "Liberal Democrat:(Intercept)"
>      >      [2,] "Labour:age"                               "Liberal Democrat:age"
>      >      [3,] "Labour:economic.cond.national"            "Liberal Democrat:economic.cond.national"
>      >      [4,] "Labour:economic.cond.household"           "Liberal Democrat:economic.cond.household"
>      >      [5,] "Labour:Blair"                             "Liberal Democrat:Blair"
>      >      [6,] "Labour:Hague"                             "Liberal Democrat:Hague"
>      >      [7,] "Labour:Kennedy"                           "Liberal Democrat:Kennedy"
>      >      [8,] "Labour:Europe"                            "Liberal Democrat:Europe"
>      >      [9,] "Labour:political.knowledge"               "Liberal Democrat:political.knowledge"
>      >     [10,] "Labour:gendermale"                        "Liberal Democrat:gendermale"
>      >     [11,] "Liberal Democrat:(Intercept)"             "Conservative:(Intercept)"
>      >     [12,] "Liberal Democrat:age"                     "Conservative:age"
>      >     [13,] "Liberal Democrat:economic.cond.national"  "Conservative:economic.cond.national"
>      >     [14,] "Liberal Democrat:economic.cond.household" "Conservative:economic.cond.household"
>      >     [15,] "Liberal Democrat:Blair"                   "Conservative:Blair"
>      >     [16,] "Liberal Democrat:Hague"                   "Conservative:Hague"
>      >     [17,] "Liberal Democrat:Kennedy"                 "Conservative:Kennedy"
>      >     [18,] "Liberal Democrat:Europe"                  "Conservative:Europe"
>      >     [19,] "Liberal Democrat:political.knowledge"     "Conservative:political.knowledge"
>      >     [20,] "Liberal Democrat:gendermale"              "Conservative:gendermale"
>      >
>      >       > int <- c(1, 11) # remove intercepts
>      >       > colnames(V)[int]
>      >     [1] "Labour:(Intercept)"           "Liberal Democrat:(Intercept)"
>      >
>      >       > colnames(V1)[int]
>      >     [1] "Liberal Democrat:(Intercept)" "Conservative:(Intercept)"
>      >       > V <- V[-int, -int]
>      >       > V1 <- V1[-int, -int]
>      >
>      >       > age <- c(1, 10) # locate age coefficients
>      >       > colnames(V)[age]
>      >     [1] "Labour:age"           "Liberal Democrat:age"
>      >       > colnames(V1)[age]
>      >     [1] "Liberal Democrat:age" "Conservative:age"
>      >
>      >       > V <- cov2cor(V) # compute coefficient correlations
>      >       > V1 <- cov2cor(V1)
>      >
>      >       > # compare GVIFs:
>      >       > c(det(V[age, age])*det(V[-age, -age])/det(V),
>      >     +   det(V1[age, age])*det(V1[-age, -age])/det(V1))
>      >     [1] 1.046232 1.046229
>      >
>      >     --------- snip -----------
>      >
>      >     For curiosity, I applied car::vif() and performance::check_collinearity()
>      >     to these models to see what they would do. Both returned the wrong
>      >     answer. vif() produced a warning, but check_collinearity() didn't:
>      >
>      >     --------- snip -----------
>      >
>      >       > car::vif(m1)
>      >                         age  economic.cond.national economic.cond.household
>      >                   15.461045               22.137772               16.693877
>      >                       Blair                   Hague                 Kennedy
>      >                   14.681562                7.483039               15.812067
>      >                      Europe     political.knowledge                  gender
>      >                    6.502119                4.219507                2.313885
>      >     Warning message:
>      >     In vif.default(m1) : No intercept: vifs may not be sensible.
>      >
>      >       > performance::check_collinearity(m)
>      >     # Check for Multicollinearity
>      >
>      >     Low Correlation
>      >
>      >                           Term  VIF Increased SE Tolerance
>      >                            age 1.72         1.31      0.58
>      >         economic.cond.national 1.85         1.36      0.54
>      >        economic.cond.household 1.86         1.37      0.54
>      >                          Blair 1.63         1.28      0.61
>      >                          Hague 1.94         1.39      0.52
>      >                        Kennedy 1.70         1.30      0.59
>      >                         Europe 2.01         1.42      0.50
>      >            political.knowledge 1.94         1.39      0.52
>      >                         gender 1.78         1.33      0.56
>      >       > performance::check_collinearity(m1)
>      >     # Check for Multicollinearity
>      >
>      >     Low Correlation
>      >
>      >                           Term  VIF Increased SE Tolerance
>      >                            age 1.19         1.09      0.84
>      >         economic.cond.national 1.42         1.19      0.70
>      >        economic.cond.household 1.32         1.15      0.76
>      >                          Blair 1.50         1.22      0.67
>      >                          Hague 1.30         1.14      0.77
>      >                        Kennedy 1.19         1.09      0.84
>      >                         Europe 1.34         1.16      0.75
>      >            political.knowledge 1.30         1.14      0.77
>      >                         gender 1.23         1.11      0.81
>      >
>      >     --------- snip -----------
>      >
>      >     I looked at the code for vif() and check_collinearity() to see where
>      >     they went wrong. Both failed to handle the two intercepts in the model
>      >     correctly -- vif() thought there was no intercept and
>      >     check_collinearity() just removed the first intercept but not the
>      >     second.
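
A quick way to see the two intercepts concretely, using the multinom fit m from
the quoted code above (the result is the same pair of names shown earlier):

grep("\\(Intercept\\)", colnames(vcov(m)), value = TRUE)
## [1] "Labour:(Intercept)"           "Liberal Democrat:(Intercept)"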
>      >
>      >     In examining the code for check_collinearity(), I discovered a couple
>      >     of additional disconcerting facts. First, part of the code seems to be
>      >     copied from vif.default(). Second, as a consequence,
>      >     check_collinearity() actually computes GVIFs rather than VIFs (and
>      >     doesn't reference either the Fox and Monette paper introducing GVIFs or
>      >     the car package) but doesn't seem to understand that, and, for example,
>      >     takes the square root of the GVIF (reported in the column marked
>      >     "Increased SE") rather than the 2p-th root (when there are p > 1
>      >     coefficients in a term).
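
As a small numerical check of the scaling difference, using the correct GVIF for
age computed in the quoted code above (the age term has p = 2 coefficients, one
per non-reference response level):

gvif.age <- 1.046232   # GVIF for age from the comparison above
sqrt(gvif.age)         # about 1.0229: the square-root scaling described above
gvif.age^(1/(2*2))     # 1.011363: the GVIF^(1/(2*p)) scaling used by GVIF()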
>      >
>      >     Here's the relevant code from the two functions (where . . . denotes
>      >     elided lines) -- the default method for vif() and
>      >     .check_collinearity(), which is called by check_collinearity.default():
>      >
>      >     --------- snip -----------
>      >
>      >       > car:::vif.default
>      >     function (mod, ...)
>      >     {
>      >           . . .
>      >           v <- vcov(mod)
>      >           assign <- attr(model.matrix(mod), "assign")
>      >           if (names(coefficients(mod)[1]) == "(Intercept)") {
>      >               v <- v[-1, -1]
>      >               assign <- assign[-1]
>      >           }
>      >           else warning("No intercept: vifs may not be sensible.")
>      >           terms <- labels(terms(mod))
>      >           n.terms <- length(terms)
>      >           if (n.terms < 2)
>      >               stop("model contains fewer than 2 terms")
>      >           R <- cov2cor(v)
>      >           detR <- det(R)
>      >           . . .
>      >           for (term in 1:n.terms) {
>      >               subs <- which(assign == term)
>      >               result[term, 1] <- det(as.matrix(R[subs, subs])) *
>      >                   det(as.matrix(R[-subs, -subs]))/detR
>      >               result[term, 2] <- length(subs)
>      >           }
>      >           . . .
>      >     }
>      >
>      >       > performance:::.check_collinearity
>      >     function (x, component, verbose = TRUE)
>      >     {
>      >           v <- insight::get_varcov(x, component = component, verbose = FALSE)
>      >           assign <- .term_assignments(x, component, verbose = verbose)
>      >           . . .
>      >           if (insight::has_intercept(x)) {
>      >               v <- v[-1, -1]
>      >               assign <- assign[-1]
>      >           }
>      >           else {
>      >               if (isTRUE(verbose)) {
>      >                   warning("Model has no intercept. VIFs may not be sensible.",
>      >                       call. = FALSE)
>      >               }
>      >           }
>      >               . . .
>      >               terms <- labels(stats::terms(f[[component]]))
>      >               . . .
>      >           n.terms <- length(terms)
>      >           if (n.terms < 2) {
>      >               if (isTRUE(verbose)) {
>      >                   warning(insight::format_message(sprintf(
>      >                       "Not enough model terms in the %s part of the model to check for multicollinearity.",
>      >                       component)), call. = FALSE)
>      >               }
>      >               return(NULL)
>      >           }
>      >           R <- stats::cov2cor(v)
>      >           detR <- det(R)
>      >           . . .
>      >           for (term in 1:n.terms) {
>      >               subs <- which(assign == term)
>      >                   . . .
>      >                   result <- c(result, det(as.matrix(R[subs, subs])) *
>      >                       det(as.matrix(R[-subs, -subs]))/detR)
>      >                   . . .
>      >           }
>      >           . . .
>      >     }
>      >
>      >     --------- snip -----------
>      >
>      >     So, the upshot of all this is that you should be able to do what you
>      >     want, but not with either car::vif() or
>      >     performance::check_collinearity(). Instead, either write your own
>      >     function or do the computations in a script.
>      >
>      >     There's also a lesson here about S3 default methods: The fact that a
>      >     default method returns a result rather than throwing an error or a
>      >     warning doesn't mean that the result is the right answer.
>      >
>      >     I hope this helps,
>      >        John
>      >
>      >
>      >     On 2022-02-26 3:45 p.m., Juho Kristian Ruohonen wrote:
>      >      > Dear John W,
>      >      >
>      >      > Thank you very much for the tip-off! Apologies for not responding
>      >      > earlier (gmail apparently decided to direct your email right into
>      >      > the junk folder). I am very pleased to note that the package you
>      >      > mention does indeed work with *brms* multinomial models! Thanks again!
>      >      >
>      >      > Best,
>      >      >
>      >      > Juho
>      >      >
>      >      > On Fri, 25 Feb 2022 at 19:23, John Willoughby (johnwillec at gmail.com) wrote:
>      >      >
>      >      >> Have you tried the check_collinearity() function in the performance
>      >      >> package? It's supposed to work on brms models, but whether it will
>      >      >> work on a multinomial model I don't know. It works well on mixed
>      >      >> models generated by glmmTMB().
>      >      >>
>      >      >> John Willoughby
>      >      >>
>      >      >>
>      >      >> On Fri, Feb 25, 2022 at 3:01 AM
>      >      >> <r-sig-mixed-models-request at r-project.org> wrote:
>      >      >>
>      >      >>> Date: Fri, 25 Feb 2022 10:23:25 +0200
>      >      >>> From: Juho Kristian Ruohonen <juho.kristian.ruohonen at gmail.com>
>      >      >>> To: John Fox <jfox at mcmaster.ca>
>      >      >>> Cc: r-sig-mixed-models at r-project.org
>      >      >>> Subject: [R-sig-ME] Collinearity diagnostics for (mixed) multinomial models
>      >      >>>
>      >      >>> Dear John (and anyone else qualified to comment),
>      >      >>>
>      >      >>> I fit lots of mixed-effects multinomial models in my research, and I
>      >      >>> would like to see some (multi)collinearity diagnostics on the fixed
>      >      >>> effects, of which there are over 30. My models are fit using the
>      >      >>> Bayesian *brms* package because I know of no frequentist packages
>      >      >>> with multinomial GLMM compatibility.
>      >      >>>
>      >      >>> With continuous or dichotomous outcomes, my go-to function for
>      >      >>> calculating multicollinearity diagnostics is of course *vif()* from
>      >      >>> the *car* package. As expected, however, this function does not
>      >      >>> report sensible diagnostics for multinomial models -- not even for
>      >      >>> standard ones fit by the *nnet* package's *multinom()* function. The
>      >      >>> reason, I presume, is that a multinomial model is not really one
>      >      >>> model but C-1 regression models (where C is the number of response
>      >      >>> categories), and the *vif()* function is not designed to deal with
>      >      >>> this scenario.
>      >      >>>
>      >      >>> Therefore, in order to obtain meaningful collinearity metrics, my
>      >      >>> present plan is to write a simple helper function that uses *vif()*
>      >      >>> to calculate and present (generalized) variance inflation metrics
>      >      >>> for the C-1 sub-datasets to which the C-1 component binomial models
>      >      >>> of the overall multinomial model are fit. In other words, it will
>      >      >>> partition the data into those C-1 subsets, and then apply *vif()* to
>      >      >>> as many linear regressions using a made-up continuous response and
>      >      >>> the fixed effects of interest.
>      >      >>>
>      >      >>> Does this seem like a sensible approach?
>      >      >>>
>      >      >>> Best,
>      >      >>>
>      >      >>> Juho
>      >      >>>
> 
-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


