[R-sig-ME] Collinearity diagnostics for (mixed) multinomial models
Juho Kristian Ruohonen
juho@kr|@t|@n@ruohonen @end|ng |rom gm@||@com
Fri Feb 25 20:52:55 CET 2022
Dear John (Fox, and other list members),
Could we achieve the invariance Professor Fox refers to by altering my
initial approach in the following ways:
1. Instead of fitting a *linear* model with a mock continuous response
to each subdataset and applying *vif() *to that, we fit to each
subdataset an actual binary GLM with a real pairing of two response
categories as LHS.
2. Instead of fitting only C-1 binary sub-GLMs for which to calculate
GVIF diagnostics, we fit *choose(C, 2) *such submodels, i.e. *one for
every possible pairing of response categories*.
3. Finally, for each coefficient, we average the GVIF statistic
over the *choose(C,
2)* submodels in order to obtain a summary statistic.
What do you gentlemen think?
Best,
Juho
pe 25. helmik. 2022 klo 19.40 John Fox (jfox using mcmaster.ca) kirjoitti:
> Dear Juhu,
>
> Apologies for my slow response -- I had a busy morning.
>
> I hadn't thought about generalizing VIFs to multinomial regression
> models, and I haven't thought the question through now, but I don't
> think that what you propose makes sense.
>
> For a linear model, the vif() function in the car packages computes
> generalized VIFs, as proposed by Fox and Monette in the paper referenced
> in ?vif. That is, for the linear model y ~ 1 + X, where X is a matrix of
> regressors, the generalized VIF associated with the regression
> coefficients for a column subset of X, say X_j, is GVIF_j = det R_{jj}
> det R_{-j, -j}/det R, which is interpretable as the size (hypervolume)
> of a confidence ellipsoid for the coefficients of X_j relative to the
> size of the confidence ellipsoid for similar "utopian" data in which X_j
> and X_{-j} are uncorrelated. Here, det R_{jj} is the determinant of the
> correlation matrix among the columns of X_j, det R_{-j, -j} is the
> determinant of the correlation matrix among the remaining columns of X,
> and det R is the correlation matrix among all of the columns of X.
>
> This has the nice property that the bases for the subspaces spanned by
> the columns of X_j and X_{-j} are irrelevant, and thus, e.g., it doesn't
> matter what kind of contrasts one uses for a factor. Also when X_j is
> just one column of X, the GVIF specializes to the usual VIF.
>
> Actually, vif() uses the correlation matrix of the coefficients R_{bb}
> rather than the correlations of the variables R, but that turns out to
> be equivalent, and also suggests a generalization to other regression
> models, such as GLMs. More generally, however (that is, beyond linear
> models), the correlations of the coefficients involve y as well as X,
> and so there's some slippage in interpretation -- now the utopian data
> are no longer necessarily for uncorrelated Xs. This is true as well for
> some other diagnostics generalized beyond linear models, such as
> hatvalues, which, e.g., for GLMs, depend on the ys as well as the Xs.
>
> Analogously, in generalizing GVIFs further to a model such as a
> multinomial regression one would want a result that doesn't depend on
> the arbitrary parametrization of the LHS of the model -- for example,
> which level of the response is taken as the reference level. As I said,
> I haven't tried to think that through, but your solution isn't invariant
> in this way.
>
> I hope this helps,
> John
>
> On 2022-02-25 3:23 a.m., Juho Kristian Ruohonen wrote:
> > Dear John (and anyone else qualified to comment),
> >
> > I fit lots of mixed-effects multinomial models in my research, and I
> > would like to see some (multi)collinearity diagnostics on the fixed
> > effects, of which there are over 30. My models are fit using the
> > Bayesian *brms* package because I know of no frequentist packages with
> > multinomial GLMM compatibility.
> >
> > With continuous or dichotomous outcomes, my go-to function for
> > calculating multicollinearity diagnostics is of course *vif()* from the
> > /car/ package. As expected, however, this function does not report
> > sensible diagnostics for multinomial models -- not even for standard
> > ones fit by the /nnet/ package's *multinom()* function. The reason, I
> > presume, is because a multinomial model is not really one but C-1
> > regression models (where C is the number of response categories) and
> > the *vif()* function is not designed to deal with this scenario.
> >
> > Therefore, in order to obtain meaningful collinearity metrics, my
> > present plan is to write a simple helper function that uses *vif() *to
> > calculate and present (generalized) variance inflation metrics for the
> > C-1 sub-datasets to which the C-1 component binomial models of the
> > overall multinomial model are fit. In other words, it will partition the
> > data into those C-1 subsets, and then apply *vif()* to as many linear
> > regressions using a made-up continuous response and the fixed effects of
> > interest.
> >
> > Does this seem like a sensible approach?
> >
> > Best,
> >
> > Juho
> >
> >
> >
> >
> > ma 27. syysk. 2021 klo 19.26 John Fox (jfox using mcmaster.ca
> > <mailto:jfox using mcmaster.ca>) kirjoitti:
> >
> > Dear Simon,
> >
> > I believe that Russ's point is that the fact that the additive model
> > allows you to estimate nonsensical quantities like a mean for girls
> in
> > all-boys' schools implies a problem with the model. Why not do as I
> > suggested and define two dichotomous factors: sex of student
> > (male/female) and type of school (coed, same-sex)? The four
> > combinations
> > of levels then make sense.
> >
> > Best,
> > John
> >
> > On 2021-09-27 12:09 p.m., Simon Harmel wrote:
> > > Thanks, Russ! There is one thing that I still don't understand. We
> > > have two completely empty cells (boys in girl-only & girls in
> > boy-only
> > > schools). Then, how are the means of those empty cells computed
> (what
> > > data is used in their place in the additive model)?
> > >
> > > Let's' simplify the model for clarity:
> > >
> > > library(R2MLwiN)
> > > library(emmeans)
> > >
> > > Form3 <- normexam ~ schgend + sex ## + standlrt + (standlrt |
> school)
> > > model3 <- lm(Form3, data = tutorial)
> > >
> > > emmeans(model3, pairwise~sex+schgend)$emmeans
> > >
> > > sex schgend emmean SE df lower.CL upper.CL
> > > boy mixedsch -0.2160 0.0297 4055 -0.2742 -0.15780
> > > girl mixedsch 0.0248 0.0304 4055 -0.0348 0.08437
> > > boy boysch 0.0234 0.0437 4055 -0.0623 0.10897
> > > girl boysch 0.2641 0.0609 4055 0.1447 0.38360<-how
> computed?
> > > boy girlsch -0.0948 0.0502 4055 -0.1931 0.00358<-how
> computed?
> > > girl girlsch 0.1460 0.0267 4055 0.0938 0.19829
> > >
> > >
> > >
> > >
> > >
> > > On Sun, Sep 26, 2021 at 8:22 PM Lenth, Russell V
> > > <russell-lenth using uiowa.edu <mailto:russell-lenth using uiowa.edu>> wrote:
> > >>
> > >> By the way, returning to the topic of interpreting coefficients,
> > you ought to have fun with the ones from the model I just fitted:
> > >>
> > >> Fixed effects:
> > >> Estimate Std. Error t value
> > >> (Intercept) -0.18882 0.05135 -3.677
> > >> standlrt 0.55442 0.01994 27.807
> > >> schgendboysch 0.17986 0.09915 1.814
> > >> schgendgirlsch 0.17482 0.07877 2.219
> > >> sexgirl 0.16826 0.03382 4.975
> > >>
> > >> One curious thing you'll notice is that there are no
> > coefficients for the interaction terms. Why? Because those terms
> > were "thrown out" of the model, and so they are not shown. I think
> > it is unwise to not show what was thrown out (e.g., lm would have
> > shown them as NAs), because in fact what we see is but one of
> > infinitely many possible solutions to the regression equations. This
> > is the solution where the last two coefficients are constrained to
> > zero. There is another equally reasonable one where the coefficients
> > for schgendboysch and schgendgirlsch are constrained to zero, and
> > the two interaction effects would then be non-zero. And infinitely
> > more where all 7 coefficients are non-zero, and there are two linear
> > constraints among them.
> > >>
> > >> Of course, since the particular estimate shown consists of all
> > the main effects and interactions are constrained to zero, it does
> > demonstrate that the additive model *could* have been used to obtain
> > the same estimates and standard errors, and you can see that by
> > comparing the results (and ignoring the invalid ones from the
> > additive model). But it is just a lucky coincidence that it worked
> > out this way, and the additive model did lead us down a primrose
> > path containing silly results among the correct ones.
> > >>
> > >> Russ
> > >>
> > >> -----Original Message-----
> > >> From: Lenth, Russell V
> > >> Sent: Sunday, September 26, 2021 7:43 PM
> > >> To: Simon Harmel <sim.harmel using gmail.com
> > <mailto:sim.harmel using gmail.com>>
> > >> Cc: r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>
> > >> Subject: RE: [External] Re: [R-sig-ME] Help with interpreting
> > one fixed-effect coefficient
> > >>
> > >> I guess correctness is in the eyes of the beholder. But I think
> > this illustrates the folly of the additive model. Having additive
> > effects suggests a belief that you can vary one factor more or less
> > independently of the other. In his comments, John Fox makes a good
> > point that escaped my earlier cursory view of the original question,
> > that you don't have data on girls attending all-boys' schools, nor
> > boys attending all-girls' schools; yet the model that was fitted
> > estimates a mean response for both those situations. That's a pretty
> > clear testament to the failure of that model – and also why the
> > coefficients don't make sense. And finally why we have estimates of
> > 15 comparisons (some of which are aliased with one another), when
> > only 6 of them make sense.
> > >>
> > >> If instead, a model with interaction were fitted, it would be a
> > rank-deficient model because two cells are empty. Perhaps there is
> > some sort of nesting structure that could be used to work around
> > that. However, it doesn't matter much because emmeans assesses
> > estimability, and the two combinations I mentioned above would be
> > flagged as non-estimable. One could then more judiciously use the
> > contrast function to test meaningful contrasts across this irregular
> > array of cell means. Or even injudiciously asking for all pairwise
> > comparisons, you will see 6 estimable ones and 9 non-estimable ones.
> > See output below.
> > >>
> > >> Russ
> > >>
> > >> ----- Interactive model -----
> > >>
> > >>> Form <- normexam ~ 1 + standlrt + schgend * sex + (standlrt |
> > school)
> > >>> model <- lmer(Form, data = tutorial, REML = FALSE)
> > >> fixed-effect model matrix is rank deficient so dropping 2
> > columns / coefficients
> > >>>
> > >>> emmeans(model, pairwise~schgend+sex)
> > >>
> > >> ... messages deleted ...
> > >>
> > >> $emmeans
> > >> schgend sex emmean SE df asymp.LCL asymp.UCL
> > >> mixedsch boy -0.18781 0.0514 Inf -0.2885 -0.0871
> > >> boysch boy -0.00795 0.0880 Inf -0.1805 0.1646
> > >> girlsch boy nonEst NA NA NA NA
> > >> mixedsch girl -0.01955 0.0521 Inf -0.1216 0.0825
> > >> boysch girl nonEst NA NA NA NA
> > >> girlsch girl 0.15527 0.0632 Inf 0.0313 0.2792
> > >>
> > >> Degrees-of-freedom method: asymptotic
> > >> Confidence level used: 0.95
> > >>
> > >> $contrasts
> > >> contrast estimate SE df z.ratio
> p.value
> > >> mixedsch boy - boysch boy -0.1799 0.0991 Inf -1.814
> 0.4565
> > >> mixedsch boy - girlsch boy nonEst NA NA NA
> NA
> > >> mixedsch boy - mixedsch girl -0.1683 0.0338 Inf -4.975
> <.0001
> > >> mixedsch boy - boysch girl nonEst NA NA NA
> NA
> > >> mixedsch boy - girlsch girl -0.3431 0.0780 Inf -4.396
> 0.0002
> > >> boysch boy - girlsch boy nonEst NA NA NA
> NA
> > >> boysch boy - mixedsch girl 0.0116 0.0997 Inf 0.116
> 1.0000
> > >> boysch boy - boysch girl nonEst NA NA NA
> NA
> > >> boysch boy - girlsch girl -0.1632 0.1058 Inf -1.543
> 0.6361
> > >> girlsch boy - mixedsch girl nonEst NA NA NA
> NA
> > >> girlsch boy - boysch girl nonEst NA NA NA
> NA
> > >> girlsch boy - girlsch girl nonEst NA NA NA
> NA
> > >> mixedsch girl - boysch girl nonEst NA NA NA
> NA
> > >> mixedsch girl - girlsch girl -0.1748 0.0788 Inf -2.219
> 0.2287
> > >> boysch girl - girlsch girl nonEst NA NA NA
> NA
> > >>
> > >> Degrees-of-freedom method: asymptotic
> > >> P value adjustment: tukey method for comparing a family of 6
> > estimates
> > >>
> > >>
> > >> ---------------------------------------------------------
> > >> From: Simon Harmel <sim.harmel using gmail.com
> > <mailto:sim.harmel using gmail.com>>
> > >> Sent: Sunday, September 26, 2021 3:08 PM
> > >> To: Lenth, Russell V <russell-lenth using uiowa.edu
> > <mailto:russell-lenth using uiowa.edu>>
> > >> Cc: r-sig-mixed-models using r-project.org
> > <mailto:r-sig-mixed-models using r-project.org>
> > >> Subject: [External] Re: [R-sig-ME] Help with interpreting one
> > fixed-effect coefficient
> > >>
> > >> Dear Russ and the List Members,
> > >>
> > >> If we use Russ' great package (emmeans), we see that although
> > meaningless, but "schgendgirl-only" can be interpreted using the
> > logic I mentioned here:
> > https://stat.ethz.ch/pipermail/r-sig-mixed-models/2021q3/029723.html
> > <
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2021q3/029723.html> .
> > >>
> > >> That is, "schgendgirl-only" can meaninglessly mean: ***diff.
> > bet. boys in girl-only vs. mixed schools*** just like it can
> > meaningfully mean: ***diff. bet. girls in girl-only vs. mixed
> > schools***
> > >>
> > >> Russ, have I used emmeans correctly?
> > >>
> > >> Simon
> > >>
> > >> Here is a reproducible code:
> > >>
> > >> library(R2MLwiN) # For the dataset
> > >> library(lme4)
> > >> library(emmeans)
> > >>
> > >> data("tutorial")
> > >>
> > >> Form <- normexam ~ 1 + standlrt + schgend + sex + (standlrt |
> > school)
> > >> model <- lmer(Form, data = tutorial, REML = FALSE)
> > >>
> > >> emmeans(model, pairwise~schgend+sex)$contrast
> > >>
> > >> contrast estimate SE df z.ratio p.value
> > >> mixedsch boy - boysch boy -0.17986 0.0991 Inf -1.814 0.4565
> > >> mixedsch boy - girlsch boy -0.17482 0.0788 Inf -2.219 0.2287
> > <--This coef. equals
> > >> mixedsch boy - mixedsch girl -0.16826 0.0338 Inf -4.975 <.0001
> > >> mixedsch boy - boysch girl -0.34813 0.1096 Inf -3.178 0.0186
> > >> mixedsch boy - girlsch girl -0.34308 0.0780 Inf -4.396 0.0002
> > >> boysch boy - girlsch boy 0.00505 0.1110 Inf 0.045 1.0000
> > >> boysch boy - mixedsch girl 0.01160 0.0997 Inf 0.116 1.0000
> > >> boysch boy - boysch girl -0.16826 0.0338 Inf -4.975 <.0001
> > >> boysch boy - girlsch girl -0.16322 0.1058 Inf -1.543 0.6361
> > >> girlsch boy - mixedsch girl 0.00656 0.0928 Inf 0.071 1.0000
> > >> girlsch boy - boysch girl -0.17331 0.1255 Inf -1.381 0.7388
> > >> girlsch boy - girlsch girl -0.16826 0.0338 Inf -4.975 <.0001
> > >> mixedsch girl - boysch girl -0.17986 0.0991 Inf -1.814 0.4565
> > >> mixedsch girl - girlsch girl -0.17482 0.0788 Inf -2.219 0.2287
> > <--This coef.
> > >> boysch girl - girlsch girl 0.00505 0.1110 Inf 0.045 1.0000
> > >>
> > >>
> > >
> > > _______________________________________________
> > > R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > >
> > --
> > John Fox, Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > web: https://socialsciences.mcmaster.ca/jfox/
> > <https://socialsciences.mcmaster.ca/jfox/>
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org
> > <mailto:R-sig-mixed-models using r-project.org> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list