[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models

Fri Feb 25 20:52:55 CET 2022

Dear John (Fox, and other list members),

Could we achieve the invariance Professor Fox refers to by altering my
initial approach in the following ways:

   1. Instead of fitting a *linear* model with a mock continuous response
   to each subdataset and applying *vif() *to that, we fit to each
   subdataset an actual binary GLM with a real pairing of two response
   categories as LHS.
   2. Instead of fitting only C-1 binary sub-GLMs for which to calculate
   GVIF diagnostics, we fit *choose(C, 2) *such submodels, i.e. *one for
   every possible pairing of response categories*.
   3. Finally, for each coefficient, we average the GVIF statistic
over the *choose(C,
   2)* submodels in order to obtain a summary statistic.

What do you gentlemen think?

Best,

Juho

pe 25. helmik. 2022 klo 19.40 John Fox (jfox using mcmaster.ca) kirjoitti:

> Dear Juhu,
>
> Apologies for my slow response -- I had a busy morning.
>
> I hadn't thought about generalizing VIFs to multinomial regression
> models, and I haven't thought the question through now, but I don't
> think that what you propose makes sense.
>
> For a linear model, the vif() function in the car packages computes
> generalized VIFs, as proposed by Fox and Monette in the paper referenced
> in ?vif. That is, for the linear model y ~ 1 + X, where X is a matrix of
> regressors, the generalized VIF associated with the regression
> coefficients for a column subset of X, say X_j, is GVIF_j = det R_{jj}
> det R_{-j, -j}/det R, which is interpretable as the size (hypervolume)
> of a confidence ellipsoid for the coefficients of X_j relative to the
> size of the confidence ellipsoid for similar "utopian" data in which X_j
> and X_{-j} are uncorrelated. Here, det R_{jj} is the determinant of the
> correlation matrix among the columns of X_j, det R_{-j, -j} is the
> determinant of the correlation matrix among the remaining columns of X,
> and det R is the correlation matrix among all of the columns of X.
>
> This has the nice property that the bases for the subspaces spanned by
> the columns of X_j and X_{-j} are irrelevant, and thus, e.g., it doesn't
> matter what kind of contrasts one uses for a factor. Also when X_j is
> just one column of X, the GVIF specializes to the usual VIF.
>
> Actually, vif() uses the correlation matrix of the coefficients R_{bb}
> rather than the correlations of the variables R, but that turns out to
> be equivalent, and also suggests a generalization to other regression
> models, such as GLMs. More generally, however (that is, beyond linear
> models), the correlations of the coefficients involve y as well as X,
> and so there's some slippage in interpretation -- now the utopian data
> are no longer necessarily for uncorrelated Xs. This is true as well for
> some other diagnostics generalized beyond linear models, such as
> hatvalues, which, e.g., for GLMs, depend on the ys as well as the Xs.
>
> Analogously, in generalizing GVIFs further to a model such as a
> multinomial regression one would want a result that doesn't depend on
> the arbitrary parametrization of the LHS of the model -- for example,
> which level of the response is taken as the reference level. As I said,
> I haven't tried to think that through, but your solution isn't invariant
> in this way.
>
> I hope this helps,
>   John
>
> On 2022-02-25 3:23 a.m., Juho Kristian Ruohonen wrote:
> > Dear John (and anyone else qualified to comment),
> >
> > I fit lots of mixed-effects multinomial models in my research, and I
> > would like to see some (multi)collinearity diagnostics on the fixed
> > effects, of which there are over 30. My models are fit using the
> > Bayesian *brms* package because I know of no frequentist packages with
> > multinomial GLMM compatibility.
> >
> > With continuous or dichotomous outcomes, my go-to function for
> > calculating multicollinearity diagnostics is of course *vif()* from the
> > /car/ package. As expected, however, this function does not report
> > sensible diagnostics for multinomial models -- not even for standard
> > ones fit by the /nnet/ package's *multinom()* function. The reason, I
> > presume, is because a multinomial model is not really one but C-1
> > regression models  (where C is the number of response categories) and
> > the *vif()* function is not designed to deal with this scenario.
> >
> > Therefore, in order to obtain meaningful collinearity metrics, my
> > present plan is to write a simple helper function that uses *vif() *to
> > calculate and present (generalized) variance inflation metrics for the
> > C-1 sub-datasets to which the C-1 component binomial models of the
> > overall multinomial model are fit. In other words, it will partition the
> > data into those C-1 subsets, and then apply *vif()* to as many linear
> > regressions using a made-up continuous response and the fixed effects of
> > interest.
> >
> > Does this seem like a sensible approach?
> >
> > Best,
> >
> > Juho
> >
> >
> >
> >
> > ma 27. syysk. 2021 klo 19.26 John Fox (jfox using mcmaster.ca
> > <mailto:jfox using mcmaster.ca>) kirjoitti:
> >
> >     Dear Simon,
> >
> >     I believe that Russ's point is that the fact that the additive model
> >     allows you to estimate nonsensical quantities like a mean for girls
> in
> >     all-boys' schools implies a problem with the model. Why not do as I
> >     suggested and define two dichotomous factors: sex of student
> >     (male/female) and type of school (coed, same-sex)? The four
> >     combinations
> >     of levels then make sense.
> >
> >     Best,
> >        John
> >
> >     On 2021-09-27 12:09 p.m., Simon Harmel wrote:
> >      > Thanks, Russ! There is one thing that I still don't understand. We
> >      > have two completely empty cells (boys in girl-only & girls in
> >     boy-only
> >      > schools). Then, how are the means of those empty cells computed
> (what
> >      > data is used in their place in the additive model)?
> >      >
> >      > Let's' simplify the model for clarity:
> >      >
> >      > library(R2MLwiN)
> >      > library(emmeans)
> >      >
> >      > Form3 <- normexam ~ schgend + sex ## + standlrt + (standlrt |
> school)
> >      > model3 <- lm(Form3, data = tutorial)
> >      >
> >      > emmeans(model3, pairwise~sex+schgend)$emmeans
> >      >
> >      >   sex  schgend   emmean     SE   df lower.CL upper.CL
> >      >   boy  mixedsch -0.2160 0.0297 4055  -0.2742 -0.15780
> >      >   girl mixedsch  0.0248 0.0304 4055  -0.0348  0.08437
> >      >   boy  boysch    0.0234 0.0437 4055  -0.0623  0.10897
> >      >   girl boysch    0.2641 0.0609 4055   0.1447  0.38360<-how
> computed?
> >      >   boy  girlsch  -0.0948 0.0502 4055  -0.1931  0.00358<-how
> computed?
> >      >   girl girlsch   0.1460 0.0267 4055   0.0938  0.19829
> >      >
> >      >
> >      >
> >      >
> >      >
> >      > On Sun, Sep 26, 2021 at 8:22 PM Lenth, Russell V
> >      > <russell-lenth using uiowa.edu <mailto:russell-lenth using uiowa.edu>> wrote:
> >      >>
> >      >> By the way, returning to the topic of interpreting coefficients,
> >     you ought to have fun with the ones from the model I just fitted:
> >      >>
> >      >> Fixed effects:
> >      >>                 Estimate Std. Error t value
> >      >> (Intercept)    -0.18882    0.05135  -3.677
> >      >> standlrt        0.55442    0.01994  27.807
> >      >> schgendboysch   0.17986    0.09915   1.814
> >      >> schgendgirlsch  0.17482    0.07877   2.219
> >      >> sexgirl         0.16826    0.03382   4.975
> >      >>
> >      >> One curious thing you'll notice is that there are no
> >     coefficients for the interaction terms. Why? Because those terms
> >     were "thrown out" of the model, and so they are not shown. I think
> >     it is unwise to not show what was thrown out (e.g., lm would have
> >     shown them as NAs), because in fact what we see is but one of
> >     infinitely many possible solutions to the regression equations. This
> >     is the solution where the last two coefficients are constrained to
> >     zero. There is another equally reasonable one where the coefficients
> >     for schgendboysch and schgendgirlsch  are constrained to zero, and
> >     the two interaction effects would then be non-zero. And infinitely
> >     more where all 7 coefficients are non-zero, and there are two linear
> >     constraints among them.
> >      >>
> >      >> Of course, since the particular estimate shown consists of all
> >     the main effects and interactions are constrained to zero, it does
> >     demonstrate that the additive model *could* have been used to obtain
> >     the same estimates and standard errors, and you can see that by
> >     comparing the results (and ignoring the invalid ones from the
> >     additive model). But it is just a lucky coincidence that it worked
> >     out this way, and the additive model did lead us down a primrose
> >     path containing silly results among the correct ones.
> >      >>
> >      >> Russ
> >      >>
> >      >> -----Original Message-----
> >      >> From: Lenth, Russell V
> >      >> Sent: Sunday, September 26, 2021 7:43 PM
> >      >> To: Simon Harmel <sim.harmel using gmail.com
> >     <mailto:sim.harmel using gmail.com>>
> >      >> Cc: r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>
> >      >> Subject: RE: [External] Re: [R-sig-ME] Help with interpreting
> >     one fixed-effect coefficient
> >      >>
> >      >> I guess correctness is in the eyes of the beholder. But I think
> >     this illustrates the folly of the additive model. Having additive
> >     effects suggests a belief that you can vary one factor more or less
> >     independently of the other. In his comments, John Fox makes a good
> >     point that escaped my earlier cursory view of the original question,
> >     that you don't have data on girls attending all-boys' schools, nor
> >     boys attending all-girls' schools; yet the model that was fitted
> >     estimates a mean response for both those situations. That's a pretty
> >     clear testament to the failure of that model – and also why the
> >     coefficients don't make sense. And finally why we have estimates of
> >     15 comparisons (some of which are aliased with one another), when
> >     only 6 of them make sense.
> >      >>
> >      >> If instead, a model with interaction were fitted, it would be a
> >     rank-deficient model because two cells are empty. Perhaps there is
> >     some sort of nesting structure that could be used to work around
> >     that. However, it doesn't matter much because emmeans assesses
> >     estimability, and the two combinations I mentioned above would be
> >     flagged as non-estimable. One could then more judiciously use the
> >     contrast function to test meaningful contrasts across this irregular
> >     array of cell means. Or even injudiciously asking for all pairwise
> >     comparisons, you will see 6 estimable ones and 9 non-estimable ones.
> >     See output below.
> >      >>
> >      >> Russ
> >      >>
> >      >> ----- Interactive model -----
> >      >>
> >      >>> Form <- normexam ~ 1 + standlrt + schgend * sex + (standlrt |
> >     school)
> >      >>> model <- lmer(Form, data = tutorial, REML = FALSE)
> >      >> fixed-effect model matrix is rank deficient so dropping 2
> >     columns / coefficients
> >      >>>
> >      >>> emmeans(model, pairwise~schgend+sex)
> >      >>
> >      >> ... messages deleted ...
> >      >>
> >      >> $emmeans
> >      >>   schgend  sex    emmean     SE  df asymp.LCL asymp.UCL
> >      >>   mixedsch boy  -0.18781 0.0514 Inf   -0.2885   -0.0871
> >      >>   boysch   boy  -0.00795 0.0880 Inf   -0.1805    0.1646
> >      >>   girlsch  boy    nonEst     NA  NA        NA        NA
> >      >>   mixedsch girl -0.01955 0.0521 Inf   -0.1216    0.0825
> >      >>   boysch   girl   nonEst     NA  NA        NA        NA
> >      >>   girlsch  girl  0.15527 0.0632 Inf    0.0313    0.2792
> >      >>
> >      >> Degrees-of-freedom method: asymptotic
> >      >> Confidence level used: 0.95
> >      >>
> >      >> $contrasts
> >      >>   contrast                     estimate     SE  df z.ratio
> p.value
> >      >>   mixedsch boy - boysch boy     -0.1799 0.0991 Inf  -1.814
> 0.4565
> >      >>   mixedsch boy - girlsch boy     nonEst     NA  NA      NA
> NA
> >      >>   mixedsch boy - mixedsch girl  -0.1683 0.0338 Inf  -4.975
> <.0001
> >      >>   mixedsch boy - boysch girl     nonEst     NA  NA      NA
> NA
> >      >>   mixedsch boy - girlsch girl   -0.3431 0.0780 Inf  -4.396
> 0.0002
> >      >>   boysch boy - girlsch boy       nonEst     NA  NA      NA
> NA
> >      >>   boysch boy - mixedsch girl     0.0116 0.0997 Inf   0.116
> 1.0000
> >      >>   boysch boy - boysch girl       nonEst     NA  NA      NA
> NA
> >      >>   boysch boy - girlsch girl     -0.1632 0.1058 Inf  -1.543
> 0.6361
> >      >>   girlsch boy - mixedsch girl    nonEst     NA  NA      NA
> NA
> >      >>   girlsch boy - boysch girl      nonEst     NA  NA      NA
> NA
> >      >>   girlsch boy - girlsch girl     nonEst     NA  NA      NA
> NA
> >      >>   mixedsch girl - boysch girl    nonEst     NA  NA      NA
> NA
> >      >>   mixedsch girl - girlsch girl  -0.1748 0.0788 Inf  -2.219
> 0.2287
> >      >>   boysch girl - girlsch girl     nonEst     NA  NA      NA
> NA
> >      >>
> >      >> Degrees-of-freedom method: asymptotic
> >      >> P value adjustment: tukey method for comparing a family of 6
> >     estimates
> >      >>
> >      >>
> >      >> ---------------------------------------------------------
> >      >> From: Simon Harmel <sim.harmel using gmail.com
> >     <mailto:sim.harmel using gmail.com>>
> >      >> Sent: Sunday, September 26, 2021 3:08 PM
> >      >> To: Lenth, Russell V <russell-lenth using uiowa.edu
> >     <mailto:russell-lenth using uiowa.edu>>
> >      >> Cc: r-sig-mixed-models using r-project.org
> >     <mailto:r-sig-mixed-models using r-project.org>
> >      >> Subject: [External] Re: [R-sig-ME] Help with interpreting one
> >     fixed-effect coefficient
> >      >>
> >      >> Dear Russ and the List Members,
> >      >>
> >      >> If we use Russ' great package (emmeans), we see that although
> >     meaningless, but "schgendgirl-only" can be interpreted using the
> >     logic I mentioned here:
> >     https://stat.ethz.ch/pipermail/r-sig-mixed-models/2021q3/029723.html
> >     <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2021q3/029723.html> .
> >      >>
> >      >> That is, "schgendgirl-only" can meaninglessly mean: ***diff.
> >     bet. boys in girl-only vs. mixed schools*** just like it can
> >     meaningfully mean:  ***diff. bet. girls in girl-only vs. mixed
> >     schools***
> >      >>
> >      >> Russ, have I used emmeans correctly?
> >      >>
> >      >> Simon
> >      >>
> >      >> Here is a reproducible code:
> >      >>
> >      >> library(R2MLwiN) # For the dataset
> >      >> library(lme4)
> >      >> library(emmeans)
> >      >>
> >      >> data("tutorial")
> >      >>
> >      >> Form <- normexam ~ 1 + standlrt + schgend + sex + (standlrt |
> >     school)
> >      >> model <- lmer(Form, data = tutorial, REML = FALSE)
> >      >>
> >      >> emmeans(model, pairwise~schgend+sex)$contrast
> >      >>
> >      >> contrast                     estimate     SE  df z.ratio p.value
> >      >> mixedsch boy - boysch boy    -0.17986 0.0991 Inf -1.814  0.4565
> >      >> mixedsch boy - girlsch boy   -0.17482 0.0788 Inf -2.219  0.2287
> >       <--This coef. equals
> >      >> mixedsch boy - mixedsch girl -0.16826 0.0338 Inf -4.975  <.0001
> >      >> mixedsch boy - boysch girl   -0.34813 0.1096 Inf -3.178  0.0186
> >      >> mixedsch boy - girlsch girl  -0.34308 0.0780 Inf -4.396  0.0002
> >      >> boysch boy - girlsch boy      0.00505 0.1110 Inf  0.045  1.0000
> >      >> boysch boy - mixedsch girl    0.01160 0.0997 Inf  0.116  1.0000
> >      >> boysch boy - boysch girl     -0.16826 0.0338 Inf -4.975  <.0001
> >      >> boysch boy - girlsch girl    -0.16322 0.1058 Inf -1.543  0.6361
> >      >> girlsch boy - mixedsch girl   0.00656 0.0928 Inf  0.071  1.0000
> >      >> girlsch boy - boysch girl    -0.17331 0.1255 Inf -1.381  0.7388
> >      >> girlsch boy - girlsch girl   -0.16826 0.0338 Inf -4.975  <.0001
> >      >> mixedsch girl - boysch girl  -0.17986 0.0991 Inf -1.814  0.4565
> >      >> mixedsch girl - girlsch girl -0.17482 0.0788 Inf -2.219  0.2287
> >       <--This coef.
> >      >> boysch girl - girlsch girl    0.00505 0.1110 Inf  0.045  1.0000
> >      >>
> >      >>
> >      >
> >      > _______________________________________________
> >      > R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org> mailing list
> >      > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >      >
> >     --
> >     John Fox, Professor Emeritus
> >     McMaster University
> >     Hamilton, Ontario, Canada
> >     web: https://socialsciences.mcmaster.ca/jfox/
> >     <https://socialsciences.mcmaster.ca/jfox/>
> >
> >     _______________________________________________
> >     R-sig-mixed-models using r-project.org
> >     <mailto:R-sig-mixed-models using r-project.org> mailing list
> >     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>

	[[alternative HTML version deleted]]