[R] Multivariable Wald to test equality of multinomial coefficients

Tue Oct 4 01:23:47 CEST 2016

> On 04 Oct 2016, at 00:30 , Paul Sanfilippo <prseye at gmail.com> wrote:
> 
> Hi,
> 
> I am trying to replicate a test in the Hosmer - Applied Logistic regression text (pp 289, 3rd ed) that uses a Multivariable Wald test to test the equality of coefficients across the 2 logits of a 3 category response multinomial model. I’d like to see whether (from a  statistical standpoint) it is acceptable to collapse the 2 response categories and then simply use a binary logistic regression. The idea is that if the coefficients across the 2 logits are similar (non-significant p value with Wald test), then it is reasonable to pool the categories.
> 
> There does not seem to be a built in way to do this in R?
> 
> Using the mtcars dataset as an example (for the sake of the example, using cyl as a 3-factor response), does anyone have any ideas how to do this
> 
> library(nnet)
> data(mtcars)
> mtcars$cyl <- as.factor(mtcars$cyl)  
> mtcars$am <- as.factor(mtcars$am)  
> mod <- multinom(cyl ~ am + hp, data=mtcars)
> summary(mod)
> 
>> summary(mod)
> Call:
> multinom(formula = cyl ~ am + hp, data = mtcars)
> 
> Coefficients:
>   (Intercept)       am1        hp
> 6   -42.03847  -3.77398 0.4147498
> 8   -92.30944 -26.27554 0.7836576
> 
> So, I want to simultaneously test whether the 3 coefficients across the 2 logits are similar.

R and R-packages do not always produce every single test that someone have thought up. Sometimes, you just get the tools to roll your own, and are expected to know enough theory to do so.

The generic technique for a Wald test would be to 

(a) extract the variance-covariance matrix (V) of the coefficients (beta)
(b) write the linear relation that you wish to test in matrix form A beta = 0
(c) the variance-covariance matrix of A beta will be A V A'
(d) the Wald test is (A beta)' (A V A')^{-1} (A beta)

For the case of multinom(), a little extra care is needed because it presents the coefficients as a matrix, and the variance covariance matrix is ordered as if the coefficients were organized as a vector _by row_:

> vcov(mod)
              6:(Intercept)       6:am1        6:hp 8:(Intercept)        8:am1
6:(Intercept)    771.682250  69.0782649 -7.61945359    168.212743  -463.676388
6:am1             69.078265  10.6015542 -0.71221686     13.069175   -38.850954
6:hp              -7.619454  -0.7122169  0.07550636     -1.647338     4.560144
8:(Intercept)    168.212743  13.0691754 -1.64733776   1019.860307 -1473.837691
8:am1           -463.676388 -38.8509537  4.56014436  -1473.837691  2195.306673
8:hp              -3.169803  -0.2992327  0.03147124     -7.719801    11.719345
                     8:hp
6:(Intercept) -3.16980299
6:am1         -0.29923273
6:hp           0.03147124
8:(Intercept) -7.71980097
8:am1         11.71934471
8:hp           0.06548745
> coef(mod)
  (Intercept)       am1        hp
6   -42.03847  -3.77398 0.4147498
8   -92.30944 -26.27554 0.7836576

So the thing to do would be roughly like this (the code can surely be improved):

> beta <- as.vector(t(coef(mod)))
> A  <- rbind(c(1,0,0,-1,0,0), c(0,1,0,0,-1,0), c(0,0,1,0,0,-1)) 
> A
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    0   -1    0    0
[2,]    0    1    0    0   -1    0
[3,]    0    0    1    0    0   -1
> A %*% beta
           [,1]
[1,] 50.2709610
[2,] 22.5015565
[3,] -0.3689079
> t(A %*% beta) %*% solve(A %*% vcov(mod) %*% t(A), A %*% beta) 
         [,1]
[1,] 3.592326

And then get the p-value from the asymptotic chi-square on 3-df

> pchisq(3.59, 3, lower=FALSE)
[1] 0.3092756

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com