[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models
Juho Kristian Ruohonen
juho@kr|@t|@n@ruohonen @end|ng |rom gm@||@com
Tue Mar 1 14:24:42 CET 2022
Dear John (Fox, as well as other list members),
I've now written a simple function to try and calculate GVIFS for all
predictors in a nnet::multinom() object based on John's example code. If
its results are correct (see below), I will proceed to write a version that
also works with mixed-effects multinomial models fit by brms::brm(). Here's
the code:
gvif.multinom <- function(model){
> (classes <- model$lev)
> (V.all <- vcov(model))
> (V.noIntercepts <- V.all[!grepl("\\(Intercept\\)$", rownames(V.all),
> perl = T),
> !grepl("\\(Intercept\\)$", colnames(V.all),
> perl = T)])
> (R <- cov2cor(V.noIntercepts))
> (terms <- attr(model$terms, "term.labels"))
> (gvif <- numeric(length = length(terms)))
> (names(gvif) <- terms)
> (SE.multiplier <- numeric(length = length(terms)))
> (names(SE.multiplier) <- terms)
> #The line below tries to capture all factor levels into a regex for coef
> name matching.
> (LevelsRegex <- paste0("(", paste(unlist(model$xlevels), collapse =
> "|"),")?"))
>
> for(i in terms){
> #The regex stuff below tries to ensure all interaction coefficients
> are matched, including those involving factors.
> if(grepl(":", i)){
> (termname <- gsub(":", paste0(LevelsRegex, ":"), i, perl = T))
> }else{termname <- i}
> (RegexToMatch <- paste0("^(", paste(classes[2:length(classes)],
> collapse = "|") ,"):", termname, LevelsRegex, "$"))
>
> #Now the actual calculation:
> (indices <- grep(RegexToMatch, rownames(R), perl = T))
> (gvif[i] <- det(R[indices, indices]) * det(R[-indices, -indices]) /
> det(R))
> (SE.multiplier[i] <- gvif[i]^(1/(2*length(indices))))
> }
> #Put the results together and order them by degree of SE inflation:
> (result <- cbind(GVIF = gvif, `GVIF^(1/(2df))` = SE.multiplier))
> return(result[order(result[,"GVIF^(1/(2df))"], decreasing = T),])}
>
The results seem correct to me when applied to John's example model fit to
the BEPS data. However, that dataset contains no multi-df factors, of which
my own models have many. Below is a maximally simple example with one
multi-df factor (*region*):
mod1 <- multinom(partic ~., data = carData::Womenlf)
> gvif.multinom(mod1)
>
> GVIF GVIF^(1/(2df))
> children 1.298794 1.067542
> hincome 1.184215 1.043176
> region 1.381480 1.020403
>
These results look plausible to me. Finally, below is an example involving
both a multi-df factor and an interaction:
mod2 <- update(mod1, ~. +children:region)
> gvif.multinom(mod2)
>
> GVIF GVIF^(1/(2df))
> children:region 4.965762e+16 11.053482
> region 1.420418e+16 10.221768
> children 1.471412e+03 6.193463
> hincome 6.462161e+00 1.594390
>
These results look a bit more dubious. To be sure, it is to be expected
that interaction terms will introduce a lot of collinearity. But an 11-fold
increase in SE? I hope someone can tell me whether this is correct or not!
Best,
Juho
ti 1. maalisk. 2022 klo 0.05 John Fox (jfox using mcmaster.ca) kirjoitti:
> Dear Juha,
>
> On 2022-02-28 5:00 p.m., Juho Kristian Ruohonen wrote:
> > Apologies for my misreading, John, and many thanks for showing how the
> > calculation is done for a single term.
> >
> > Do you think *vif()* might be updated in the near future with the
> > capability of auto-detecting a multinomial model and returning
> > mathematically correct GVIF statistics?
>
> The thought crossed my mind, but I'd want to do it in a general way, not
> just for the multinom() function, and in a way that avoids incorrect
> results such as those currently produced for "multinom" models, albeit
> with a warning. I can't guarantee whether or when I'll be able to do that.
>
> John
>
> >
> > If not, I'll proceed to writing my own function based on your example.
> > However, /car/ is such an excellent and widely used package that the
> > greatest benefit to mankind would probably accrue if /car /was upgraded
> > with this feature sooner rather than later.
> >
> > Best,
> >
> > Juho
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > ma 28. helmik. 2022 klo 17.08 John Fox (jfox using mcmaster.ca
> > <mailto:jfox using mcmaster.ca>) kirjoitti:
> >
> > Dear Juho,
> >
> > On 2022-02-28 2:06 a.m., Juho Kristian Ruohonen wrote:
> > > Dear Professor Fox and other list members,
> > >
> > > Profuse thanks for doing that detective work for me! I myself
> > thought
> > > the inflation factors reported by check_collinearity() were
> > suspiciously
> > > high, but unlike you I lacked the expertise to identify what was
> > going on.
> > >
> > > As for your suggested approach, have I understood this correctly:
> > >
> > > Since there doesn't yet exist an R function that will calculate
> the
> > > (G)VIFS of multinomial models correctly, my best bet for now is
> > just to
> > > ignore the fact that such models partition the data into C-1
> > subsets,
> > > and to calculate approximate GVIFs from the entire dataset at
> > once as if
> > > the response were continuous? And a simple way to do this is to
> > > construct a fake continuous response, call *lm(fakeresponse ~.)*,
> > and
> > > apply *car::vif()* on the result?
> >
> > No, you misunderstand my suggestion, which perhaps isn't surprising
> > given the length of my message. What you propose is what I suggested
> as
> > a rough approximation *before* I confirmed that my guess of the
> > solution
> > was correct.
> >
> > The R code that I sent yesterday showed how to compute the GVIF for a
> > multinomial regression model, and I suggested that you write either a
> > script or a simple function to do that. Here's a function that will
> > work
> > for a model object that responds to vcov():
> >
> > GVIF <- function(model, intercepts, term){
> > # model: regression model object
> > # intercepts: row/column positions of intercepts in the
> coefficient
> > covariance matrix
> > # term: row/column positions of the coefficients for the focal
> term
> > V <- vcov(model)
> > term <- colnames(V)[term]
> > V <- V[-intercepts, -intercepts]
> > V <- cov2cor(V)
> > term <- which(colnames(V) %in% term)
> > gvif <- det(V[term, term])*det(V[-term, -term])/det(V)
> > c(GVIF=gvif, "GVIF^(1/(2*p))"=gvif^(1/(2*length(term))))
> > }
> >
> > and here's an application to the multinom() example that I showed you
> > yesterday:
> >
> > > colnames(vcov(m)) # to get coefficient positions
> > [1] "Labour:(Intercept)" "Labour:age"
> >
> > [3] "Labour:economic.cond.national"
> > "Labour:economic.cond.household"
> > [5] "Labour:Blair" "Labour:Hague"
> >
> > [7] "Labour:Kennedy" "Labour:Europe"
> >
> > [9] "Labour:political.knowledge" "Labour:gendermale"
> >
> > [11] "Liberal Democrat:(Intercept)" "Liberal
> Democrat:age"
> >
> > [13] "Liberal Democrat:economic.cond.national" "Liberal
> > Democrat:economic.cond.household"
> > [15] "Liberal Democrat:Blair" "Liberal
> > Democrat:Hague"
> >
> > [17] "Liberal Democrat:Kennedy" "Liberal
> > Democrat:Europe"
> > [19] "Liberal Democrat:political.knowledge" "Liberal
> > Democrat:gendermale"
> >
> > > GVIF(m, intercepts=c(1, 11), term=c(2, 12)) # GVIF for age
> > GVIF GVIF^(1/(2*p))
> > 1.046232 1.011363
> >
> >
> > Finally, here's what you get for a linear model with the same RHS
> > (where
> > the sqrt(VIF) should be a rough approximation to GVIF^(1/4) reported
> by
> > my GVIF() function):
> >
> > > m.lm <- lm(as.numeric(vote) ~ . - vote1, data=BEPS)
> > > sqrt(car::vif(m.lm))
> > age economic.cond.national
> > economic.cond.household
> > Blair
> > 1.006508 1.124132
> > 1.075656
> > 1.118441
> > Hague Kennedy
> > Europe
> > political.knowledge
> > 1.066799 1.015532
> > 1.101741
> > 1.028546
> > gender
> > 1.017386
> >
> >
> > John
> >
> > >
> > > Best,
> > >
> > > Juho
> > >
> > > ma 28. helmik. 2022 klo 2.23 John Fox (jfox using mcmaster.ca
> > <mailto:jfox using mcmaster.ca>
> > > <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>) kirjoitti:
> > >
> > > Dear Juho,
> > >
> > > I've now had a chance to think about this problem some more,
> > and I
> > > believe that the approach I suggested is correct. I also had
> an
> > > opportunity to talk the problem over a bit with Georges
> > Monette, who
> > > coauthored the paper that introduced generalized variance
> > inflation
> > > factors (GVIFs). On the other hand, the results produced by
> > > performance::check_collinearity() for multinomial logit
> > models don't
> > > seem to be correct (see below).
> > >
> > > Here's an example, using the nnet::multinom() function to fit
> a
> > > multinomial logit model, with alternative parametrizations of
> the
> > > LHS of
> > > the model:
> > >
> > > --------- snip -----------
> > >
> > > > library(nnet) # for multinom()
> > > > library(carData) # for BEPS data set
> > >
> > > > # alternative ordering of the response levels:
> > > > BEPS$vote1 <- factor(BEPS$vote, levels=c("Labour",
> "Liberal
> > > Democrat", "Conservative"))
> > > > levels(BEPS$vote)
> > > [1] "Conservative" "Labour" "Liberal Democrat"
> > > > levels(BEPS$vote1)
> > > [1] "Labour" "Liberal Democrat" "Conservative"
> > >
> > > > m <- multinom(vote ~ . - vote1, data=BEPS)
> > > # weights: 33 (20 variable)
> > > initial value 1675.383740
> > > iter 10 value 1345.935273
> > > iter 20 value 1150.956807
> > > iter 30 value 1141.921662
> > > iter 30 value 1141.921661
> > > iter 30 value 1141.921661
> > > final value 1141.921661
> > > converged
> > > > m1 <- multinom(vote1 ~ . - vote, data=BEPS)
> > > # weights: 33 (20 variable)
> > > initial value 1675.383740
> > > iter 10 value 1280.439304
> > > iter 20 value 1165.513772
> > > final value 1141.921662
> > > converged
> > >
> > > > rbind(coef(m), coef(m1)) # compare coefficients
> > > (Intercept) age
> > economic.cond.national
> > > economic.cond.household
> > > Labour 0.9515214 -0.021913989
> 0.5575707
> > > 0.15839096
> > > Liberal Democrat 1.4119306 -0.016810735
> 0.1810761
> > > -0.01196664
> > > Liberal Democrat 0.4604567 0.005102666
> -0.3764928
> > > -0.17036682
> > > Conservative -0.9514466 0.021912305
> -0.5575644
> > > -0.15838744
> > > Blair Hague Kennedy
> Europe
> > > political.knowledge
> > > Labour 0.8371764 -0.90775585 0.2513436 -0.22781308
> > > -0.5370612
> > > Liberal Democrat 0.2937331 -0.82217625 0.6710567 -0.20004624
> > > -0.2034605
> > > Liberal Democrat -0.5434408 0.08559455 0.4197027 0.02776465
> > > 0.3336068
> > > Conservative -0.8371670 0.90778068 -0.2513735 0.22781092
> > > 0.5370545
> > > gendermale
> > > Labour 0.13765774
> > > Liberal Democrat 0.12640823
> > > Liberal Democrat -0.01125898
> > > Conservative -0.13764849
> > >
> > > > c(logLik(m), logLik(m1)) # same fit to the data
> > > [1] -1141.922 -1141.922
> > >
> > > > # covariance matrices for coefficients:
> > > > V <- vcov(m)
> > > > V1 <- vcov(m1)
> > > > cbind(colnames(V), colnames(V1)) # compare
> > > [,1] [,2]
> > >
> > > [1,] "Labour:(Intercept)" "Liberal
> > > Democrat:(Intercept)"
> > > [2,] "Labour:age" "Liberal
> > > Democrat:age"
> > >
> > > [3,] "Labour:economic.cond.national" "Liberal
> > > Democrat:economic.cond.national"
> > > [4,] "Labour:economic.cond.household" "Liberal
> > > Democrat:economic.cond.household"
> > > [5,] "Labour:Blair" "Liberal
> > > Democrat:Blair"
> > > [6,] "Labour:Hague" "Liberal
> > > Democrat:Hague"
> > > [7,] "Labour:Kennedy" "Liberal
> > > Democrat:Kennedy"
> > > [8,] "Labour:Europe" "Liberal
> > > Democrat:Europe"
> > > [9,] "Labour:political.knowledge" "Liberal
> > > Democrat:political.knowledge"
> > > [10,] "Labour:gendermale" "Liberal
> > > Democrat:gendermale"
> > > [11,] "Liberal Democrat:(Intercept)"
> > > "Conservative:(Intercept)"
> > > [12,] "Liberal Democrat:age"
> > "Conservative:age"
> > >
> > > [13,] "Liberal Democrat:economic.cond.national"
> > > "Conservative:economic.cond.national"
> > > [14,] "Liberal Democrat:economic.cond.household"
> > > "Conservative:economic.cond.household"
> > > [15,] "Liberal Democrat:Blair"
> > "Conservative:Blair"
> > >
> > > [16,] "Liberal Democrat:Hague"
> > "Conservative:Hague"
> > >
> > > [17,] "Liberal Democrat:Kennedy"
> > "Conservative:Kennedy"
> > >
> > > [18,] "Liberal Democrat:Europe"
> > "Conservative:Europe"
> > >
> > > [19,] "Liberal Democrat:political.knowledge"
> > > "Conservative:political.knowledge"
> > > [20,] "Liberal Democrat:gendermale"
> > > "Conservative:gendermale"
> > >
> > > > int <- c(1, 11) # remove intercepts
> > > > colnames(V)[int]
> > > [1] "Labour:(Intercept)" "Liberal
> Democrat:(Intercept)"
> > >
> > > > colnames(V1)[int]
> > > [1] "Liberal Democrat:(Intercept)" "Conservative:(Intercept)"
> > > > V <- V[-int, -int]
> > > > V1 <- V1[-int, -int]
> > >
> > > > age <- c(1, 10) # locate age coefficients
> > > > colnames(V)[age]
> > > [1] "Labour:age" "Liberal Democrat:age"
> > > > colnames(V1)[age]
> > > [1] "Liberal Democrat:age" "Conservative:age"
> > >
> > > > V <- cov2cor(V) # compute coefficient correlations
> > > > V1 <- cov2cor(V1)
> > >
> > > > # compare GVIFs:
> > > > c(det(V[age, age])*det(V[-age, -age])/det(V),
> > > + det(V1[age, age])*det(V1[-age, -age])/det(V1))
> > > [1] 1.046232 1.046229
> > >
> > > --------- snip -----------
> > >
> > > For curiosity, I applied car::vif() and
> > > performance::check_collinearity() to these models to see what
> > they
> > > would
> > > do. Both returned the wrong answer. vif() produced a warning,
> but
> > > check_collinearity() didn't:
> > >
> > > --------- snip -----------
> > >
> > > > car::vif(m1)
> > > age economic.cond.national
> > > economic.cond.household
> > > 15.461045 22.137772
> > > 16.693877
> > > Blair Hague
> > > Kennedy
> > > 14.681562 7.483039
> > > 15.812067
> > > Europe political.knowledge
> > > gender
> > > 6.502119 4.219507
> > > 2.313885
> > > Warning message:
> > > In vif.default(m1) : No intercept: vifs may not be sensible.
> > >
> > > > performance::check_collinearity(m)
> > > # Check for Multicollinearity
> > >
> > > Low Correlation
> > >
> > > Term VIF Increased SE Tolerance
> > > age 1.72 1.31 0.58
> > > economic.cond.national 1.85 1.36 0.54
> > > economic.cond.household 1.86 1.37 0.54
> > > Blair 1.63 1.28 0.61
> > > Hague 1.94 1.39 0.52
> > > Kennedy 1.70 1.30 0.59
> > > Europe 2.01 1.42 0.50
> > > political.knowledge 1.94 1.39 0.52
> > > gender 1.78 1.33 0.56
> > > > performance::check_collinearity(m1)
> > > # Check for Multicollinearity
> > >
> > > Low Correlation
> > >
> > > Term VIF Increased SE Tolerance
> > > age 1.19 1.09 0.84
> > > economic.cond.national 1.42 1.19 0.70
> > > economic.cond.household 1.32 1.15 0.76
> > > Blair 1.50 1.22 0.67
> > > Hague 1.30 1.14 0.77
> > > Kennedy 1.19 1.09 0.84
> > > Europe 1.34 1.16 0.75
> > > political.knowledge 1.30 1.14 0.77
> > > gender 1.23 1.11 0.81
> > >
> > > --------- snip -----------
> > >
> > > I looked at the code for vif() and check_collinearity() to
> > see where
> > > they went wrong. Both failed to handle the two intercepts in
> > the model
> > > correctly -- vif() thought there was no intercept and
> > > check_collinearity() just removed the first intercept but not
> the
> > > second.
> > >
> > > In examining the code for check_collinearity(), I discovered a
> > > couple of
> > > additional disconcerting facts. First, part of the code seems
> > to be
> > > copied from vif.default(). Second, as a consequence,
> > > check_collinearity() actually computes GVIFs rather than VIFs
> > (and
> > > doesn't reference either the Fox and Monette paper
> > introducing GVIFs or
> > > the car package) but doesn't seem to understand that, and,
> > for example,
> > > takes the squareroot of the GVIF (reported in the column
> marked
> > > "Increased SE") rather than the 2p root (when there are p > 1
> > > coefficients in a term).
> > >
> > > Here's the relevant code from the two functions (where . . .
> > denotes
> > > elided lines) -- the default method for vif() and
> > > .check_collinearity(),
> > > which is called by check_collinearity.default():
> > >
> > > --------- snip -----------
> > >
> > > > car:::vif.default
> > > function (mod, ...)
> > > {
> > > . . .
> > > v <- vcov(mod)
> > > assign <- attr(model.matrix(mod), "assign")
> > > if (names(coefficients(mod)[1]) == "(Intercept)") {
> > > v <- v[-1, -1]
> > > assign <- assign[-1]
> > > }
> > > else warning("No intercept: vifs may not be sensible.")
> > > terms <- labels(terms(mod))
> > > n.terms <- length(terms)
> > > if (n.terms < 2)
> > > stop("model contains fewer than 2 terms")
> > > R <- cov2cor(v)
> > > detR <- det(R)
> > > . . .
> > > for (term in 1:n.terms) {
> > > subs <- which(assign == term)
> > > result[term, 1] <- det(as.matrix(R[subs, subs])) *
> > > det(as.matrix(R[-subs,
> > > -subs]))/detR
> > > result[term, 2] <- length(subs)
> > > }
> > > . . .
> > > }
> > >
> > > > performance:::.check_collinearity
> > > function (x, component, verbose = TRUE)
> > > {
> > > v <- insight::get_varcov(x, component = component,
> > verbose =
> > > FALSE)
> > > assign <- .term_assignments(x, component, verbose =
> > verbose)
> > > . . .
> > > if (insight::has_intercept(x)) {
> > > v <- v[-1, -1]
> > > assign <- assign[-1]
> > > }
> > > else {
> > > if (isTRUE(verbose)) {
> > > warning("Model has no intercept. VIFs may not be
> > > sensible.",
> > > call. = FALSE)
> > > }
> > > }
> > > . . .
> > > terms <- labels(stats::terms(f[[component]]))
> > > . . .
> > > n.terms <- length(terms)
> > > if (n.terms < 2) {
> > > if (isTRUE(verbose)) {
> > > warning(insight::format_message(sprintf("Not
> > enough model
> > > terms in the %s part of the model to check for
> > multicollinearity.",
> > > component)), call. = FALSE)
> > > }
> > > return(NULL)
> > > }
> > > R <- stats::cov2cor(v)
> > > detR <- det(R)
> > > . . .
> > > for (term in 1:n.terms) {
> > > subs <- which(assign == term)
> > > . . .
> > > result <- c(result, det(as.matrix(R[subs,
> subs])) *
> > > det(as.matrix(R[-subs, -subs]))/detR)
> > > . . .
> > > }
> > > . . .
> > > }
> > >
> > > --------- snip -----------
> > >
> > > So, the upshot of all this is that you should be able to do
> > what you
> > > want, but not with either car::vif() or
> > > performance::check_collinearity(). Instead, either write your
> own
> > > function or do the computations in a script.
> > >
> > > There's also a lesson here about S3 default methods: The fact
> > that a
> > > default method returns a result rather than throwing an error
> > or a
> > > warning doesn't mean that the result is the right answer.
> > >
> > > I hope this helps,
> > > John
> > >
> > >
> > > On 2022-02-26 3:45 p.m., Juho Kristian Ruohonen wrote:
> > > > Dear John W,
> > > >
> > > > Thank you very much for the tip-off! Apologies for not
> > responding
> > > earlier
> > > > (gmail apparently decided to direct your email right into
> the
> > > junk folder).
> > > > I am very pleased to note that the package you mention does
> > > indeed work
> > > > with *brms* multinomial models! Thanks again!
> > > >
> > > > Best,
> > > >
> > > > Juho
> > > >
> > > > pe 25. helmik. 2022 klo 19.23 John Willoughby
> > > (johnwillec using gmail.com <mailto:johnwillec using gmail.com>
> > <mailto:johnwillec using gmail.com <mailto:johnwillec using gmail.com>>)
> > > > kirjoitti:
> > > >
> > > >> Have you tried the check_collinearity() function in the
> > performance
> > > >> package? It's supposed to work on brms models, but
> whether it
> > > will work on
> > > >> a multinomial model I don't know. It works well on mixed
> > models
> > > generated
> > > >> by glmmTMB().
> > > >>
> > > >> John Willoughby
> > > >>
> > > >>
> > > >> On Fri, Feb 25, 2022 at 3:01 AM
> > > <r-sig-mixed-models-request using r-project.org
> > <mailto:r-sig-mixed-models-request using r-project.org>
> > > <mailto:r-sig-mixed-models-request using r-project.org
> > <mailto:r-sig-mixed-models-request using r-project.org>>>
> > > >> wrote:
> > > >>
> > > >>> Send R-sig-mixed-models mailing list submissions to
> > > >>> r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>
> > > <mailto:r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>>
> > > >>>
> > > >>> To subscribe or unsubscribe via the World Wide Web, visit
> > > >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
> > > >>> or, via email, send a message with subject or body
> 'help' to
> > > >>> r-sig-mixed-models-request using r-project.org
> > <mailto:r-sig-mixed-models-request using r-project.org>
> > > <mailto:r-sig-mixed-models-request using r-project.org
> > <mailto:r-sig-mixed-models-request using r-project.org>>
> > > >>>
> > > >>> You can reach the person managing the list at
> > > >>> r-sig-mixed-models-owner using r-project.org
> > <mailto:r-sig-mixed-models-owner using r-project.org>
> > > <mailto:r-sig-mixed-models-owner using r-project.org
> > <mailto:r-sig-mixed-models-owner using r-project.org>>
> > > >>>
> > > >>> When replying, please edit your Subject line so it is
> > more specific
> > > >>> than "Re: Contents of R-sig-mixed-models digest..."
> > > >>>
> > > >>>
> > > >>> Today's Topics:
> > > >>>
> > > >>> 1. Collinearity diagnostics for (mixed) multinomial
> > models
> > > >>> (Juho Kristian Ruohonen)
> > > >>>
> > > >>>
> > >
> >
> ----------------------------------------------------------------------
> > > >>>
> > > >>> Message: 1
> > > >>> Date: Fri, 25 Feb 2022 10:23:25 +0200
> > > >>> From: Juho Kristian Ruohonen
> > <juho.kristian.ruohonen using gmail.com
> > <mailto:juho.kristian.ruohonen using gmail.com>
> > > <mailto:juho.kristian.ruohonen using gmail.com
> > <mailto:juho.kristian.ruohonen using gmail.com>>>
> > > >>> To: John Fox <jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>
> > <mailto:jfox using mcmaster.ca <mailto:jfox using mcmaster.ca>>>
> > > >>> Cc: "r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>
> > > <mailto:r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>>"
> > > >>> <r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>
> > > <mailto:r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>>>
> > > >>> Subject: [R-sig-ME] Collinearity diagnostics for (mixed)
> > > multinomial
> > > >>> models
> > > >>> Message-ID:
> > > >>> <
> > > >>>
> > >
> > CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> > <mailto:
> CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com>
> > >
> > <mailto:
> CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> <mailto:CAG_dBVfZr1-P7Q3kbE8TGPm-_2sJixdGCHCtWM9Q9PEnd8ftZw using mail.gmail.com
> >>>
> > > >>> Content-Type: text/plain; charset="utf-8"
> > > >>>
> > > >>> Dear John (and anyone else qualified to comment),
> > > >>>
> > > >>> I fit lots of mixed-effects multinomial models in my
> > research,
> > > and I
> > > >> would
> > > >>> like to see some (multi)collinearity diagnostics on the
> > fixed
> > > effects, of
> > > >>> which there are over 30. My models are fit using the
> > Bayesian
> > > *brms*
> > > >>> package because I know of no frequentist packages with
> > > multinomial GLMM
> > > >>> compatibility.
> > > >>>
> > > >>> With continuous or dichotomous outcomes, my go-to
> > function for
> > > >> calculating
> > > >>> multicollinearity diagnostics is of course *vif()* from
> > the *car*
> > > >> package.
> > > >>> As expected, however, this function does not report
> sensible
> > > diagnostics
> > > >>> for multinomial models -- not even for standard ones fit
> > by the
> > > *nnet*
> > > >>> package's *multinom()* function. The reason, I presume,
> is
> > > because a
> > > >>> multinomial model is not really one but C-1 regression
> > models
> > > (where C
> > > >> is
> > > >>> the number of response categories) and the *vif()*
> > function is not
> > > >> designed
> > > >>> to deal with this scenario.
> > > >>>
> > > >>> Therefore, in order to obtain meaningful collinearity
> > metrics,
> > > my present
> > > >>> plan is to write a simple helper function that uses
> > *vif() *to
> > > calculate
> > > >>> and present (generalized) variance inflation metrics for
> > the C-1
> > > >>> sub-datasets to which the C-1 component binomial models
> > of the
> > > overall
> > > >>> multinomial model are fit. In other words, it will
> partition
> > > the data
> > > >> into
> > > >>> those C-1 subsets, and then apply *vif()* to as many
> linear
> > > regressions
> > > >>> using a made-up continuous response and the fixed
> effects of
> > > interest.
> > > >>>
> > > >>> Does this seem like a sensible approach?
> > > >>>
> > > >>> Best,
> > > >>>
> > > >>> Juho
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >> [[alternative HTML version deleted]]
> > > >>
> > > >> _______________________________________________
> > > >> R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org>
> > > <mailto:R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org>> mailing list
> > > >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
> > > >>
> > > >
> > > > [[alternative HTML version deleted]]
> > > >
> > > > _______________________________________________
> > > > R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org>
> > > <mailto:R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org>> mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>>
> > > --
> > > John Fox, Professor Emeritus
> > > McMaster University
> > > Hamilton, Ontario, Canada
> > > web: https://socialsciences.mcmaster.ca/jfox/
> > <https://socialsciences.mcmaster.ca/jfox/>
> > > <https://socialsciences.mcmaster.ca/jfox/
> > <https://socialsciences.mcmaster.ca/jfox/>>
> > >
> > --
> > John Fox, Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > web: https://socialsciences.mcmaster.ca/jfox/
> > <https://socialsciences.mcmaster.ca/jfox/>
> >
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list