[R] car::deltaMethod() fails when a particular combination of categorical variables is not present
John Fox
Tue Sep 26 15:49:44 CEST 2023
Dear Michael,
You're testing a linear hypothesis, so there's no need to use the delta
method, but the linearHypothesis() function in the car package also
fails in your case:
> linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0")
Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") :
there are aliased coefficients in the model.
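The aliasing reported here can be inspected directly. The following is a minimal sketch, assuming the data frame df from the original post quoted below (rebuilt here so the block runs on its own) and R >= 3.5.0 for the complete argument of vcov(). It only shows that one coefficient is NA and that the usable covariance matrix is smaller than the coefficient vector, which is consistent with the "non-conformable arguments" error from deltaMethod(); it does not trace that function's internals.

```r
# Data copied from the original post quoted below
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2",
                "t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent",
                "sent","unsent","unsent","other","unsent","unsent","sent",
                "unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)

minimal_model <- lm(a ~ b * c, data = df)

# Six coefficients are requested, but only five cells are observed,
# so one coefficient cannot be estimated and is reported as NA.
print(coef(minimal_model))

# alias() displays the linear dependency among the model-matrix columns.
print(alias(minimal_model))

# With complete = FALSE, vcov() drops the aliased coefficient,
# so the matrix is 5 x 5 while coef() still has length 6.
v <- vcov(minimal_model, complete = FALSE)
print(dim(v))
```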
One work-around is to combine the two factors into a single factor with 5
levels, one for each combination of b and c that occurs in the data:
> df$bc <- factor(with(df, paste(b, c, sep=":")))
> df$bc
[1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent t2:unsent
[7] t2:unsent t1:sent t2:unsent t2:unsent t2:other t2:unsent
[13] t1:unsent t1:sent t2:unsent t2:other t1:unsent t2:sent
[19] t2:sent t2:unsent
Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent
> m <- lm(a ~ bc, data=df)
> summary(m)
Call:
lm(formula = a ~ bc, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610
bct2:unsent    38.95      19.11   2.039   0.0595

Residual standard error: 24.85 on 15 degrees of freedom
Multiple R-squared:  0.2613,  Adjusted R-squared:  0.06437
F-statistic: 1.327 on 4 and 15 DF,  p-value: 0.3052
Then the hypothesis is tested directly by the t-value for the
coefficient bct2:sent.
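The work-around can be run end to end as follows. This is a sketch under stated assumptions: the data are copied from the original post quoted below, and the names cell and pred are introduced here for illustration only. With R's default treatment contrasts, the coefficient bct2:sent is the difference between the t2:sent cell mean and the mean of the baseline level t1:sent. The confint() and predict() calls are additions, not part of the steps above; predict() is one possible way to get the confidence interval for the predicted value in a single cell that the original question asks for.

```r
# Data copied from the original post quoted below
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2",
                "t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent",
                "sent","unsent","unsent","other","unsent","unsent","sent",
                "unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)

# Combine the two factors into one factor with a level per observed cell
df$bc <- factor(with(df, paste(b, c, sep = ":")))
m <- lm(a ~ bc, data = df)

# Estimate, standard error, t value and p-value for bct2:sent
print(coef(summary(m))["bct2:sent", ])

# 95% confidence interval for the same coefficient
print(confint(m, "bct2:sent"))

# Assumed extension: confidence interval for the t2:sent cell mean itself
cell <- data.frame(bc = factor("t2:sent", levels = levels(df$bc)))
pred <- predict(m, newdata = cell, interval = "confidence")
print(pred)
```

Because every observed cell has its own level, no coefficient is aliased in this model, so the standard lm tools apply without modification.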
I hope that this helps,
John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
On 2023-09-26 1:12 a.m., Michael Cohn wrote:
> I'm running a linear regression with two categorical predictors and their
> interaction. One combination of levels does not occur in the data, and as
> expected, no parameter is estimated for it. I now want to significance test
> a particular combination of levels that does occur in the data (i.e., I want
> to get a confidence interval for the total prediction at given levels of
> each variable).
>
> In the past I've done this using car::deltaMethod() but in this dataset
> that does not work, as shown in the example below: The regression model
> gives the expected output, but deltaMethod() gives this error:
>
> error in t(gd) %*% vcov. : non-conformable arguments
>
> I believe this is because there is no parameter estimate for when the
> predictors have the values 't1' and 'other'. In the df_fixed dataframe,
> putting one person into that combination of categories causes deltaMethod()
> to work as expected.
>
> I don't know of any theoretical reason that missing one interaction
> parameter estimate should prevent getting a confidence interval for a
> different combination of predictors. Is there a way to use deltaMethod() or
> some other function to do this without changing my data?
>
> Thank you,
>
> - Michael Cohn
> Vote Rev (http://voterev.org)
>
>
> Demonstration:
> ------
>
> library(car)
> # create dataset with outcome and two categorical predictors
> outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
> persontype <-
> c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2","t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
> arm_letter <-
> c("unsent","unsent","unsent","unsent","sent","unsent","unsent","sent","unsent","unsent","other","unsent","unsent","sent","unsent","other","unsent","sent","sent","unsent")
> df <- data.frame(a = outcomes, b=persontype, c=arm_letter)
>
> # note: there are no records with the combination 't1' + 'other'
> table(df$b,df$c)
>
>
> #regression works as expected
> minimal_formula <- formula("a ~ b*c")
> minimal_model <- lm(minimal_formula, data=df)
> summary(minimal_model)
>
> # use deltaMethod() to get a prediction for individuals with the combination 't2' and 'sent'
> # deltaMethod() fails with "error in t(gd) %*% vcov. : non-conformable arguments."
> deltaMethod(minimal_model, "bt2 + csent + `bt2:csent`", rhs=0)
>
> # duplicate the dataset and change one record to be in the previously empty cell
> df_fixed <- df
> df_fixed[c(13),"c"] <- 'other'
> table(df_fixed$b,df_fixed$c)
>
> #deltaMethod() now works
> minimal_model_fixed <- lm(minimal_formula, data=df_fixed)
> deltaMethod(minimal_model_fixed, "bt2 + csent + `bt2:csent`", rhs=0)
>