[R] Print hypothesis warning- Car package
John Fox
jfox at mcmaster.ca
Mon Sep 18 17:50:32 CEST 2023
Hi Peter,
On 2023-09-18 10:08 a.m., peter dalgaard wrote:
> Also, I would guess that the code precedes the use of backticks in non-syntactic names.
Indeed, by more than a decade (though modified in the interim).
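As a small illustration of what "non-syntactic" means here, the following base-R sketch checks a few coefficient names of the kind that arise in this thread. The names are typed in by hand rather than taken from a fitted model, and `parses()` is a hypothetical helper written only for this example; it is not how linearHypothesis() processes hypotheses.

```r
# Coefficient names like those in the thread (typed by hand for illustration)
nm <- c("TreatmentDabrafenib",
        "ExpressionCD271+",
        "TreatmentDabrafenib:ExpressionCD271+")

# A name is syntactic if make.names() leaves it unchanged
is_syntactic <- nm == make.names(nm)

# Hypothetical helper: does a string parse as R code?
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

# The bare name ends in an operator, so the R parser rejects it;
# wrapping the same name in backticks makes it a single symbol.
bare_ok     <- parses("ExpressionCD271+ = 0")
backtick_ok <- parses("`ExpressionCD271+` = 0")

print(data.frame(name = nm, syntactic = is_syntactic))
print(c(bare = bare_ok, backticked = backtick_ok))
```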
> Could they be deployed here?
I don't think so, at least not without changing how the function works.
The problem doesn't occur when the hypothesis is specified symbolically
as a character vector, including in equation form, only when the
hypothesis matrix is given directly, in which case linearHypothesis()
tries to construct the equation-form representation, again as character
vectors. Its inability to do so when the coefficient names include
arithmetic operators doesn't, I think, require a warning or even a
message: the symbolic representation of the hypothesis can simply be
omitted. The numeric results reported are entirely unaffected.
I've made this change and will commit it to the next version of the car
package.
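To illustrate why the numeric results do not depend on the printed form of the hypothesis, here is a minimal base-R sketch. It assumes simulated balanced data with the same factor levels as in the original post (not Robert's actual data) and uses lm() rather than aov(). The F statistic is computed directly from the hypothesis matrix H and compared with the usual comparison of the additive and interaction models; the coefficient names never enter the calculation.

```r
# Simulated balanced data, 4 cases per cell (an assumption for illustration)
set.seed(1)
dat <- expand.grid(
  rep        = 1:4,
  Expression = c("CD271-", "CD271+"),
  Treatment  = c("Control", "Dabrafenib", "Trametinib", "Combination"),
  stringsAsFactors = FALSE
)
dat$Treatment  <- factor(dat$Treatment,
                         levels = c("Control", "Dabrafenib",
                                    "Trametinib", "Combination"))
dat$Expression <- factor(dat$Expression, levels = c("CD271-", "CD271+"))
dat$Viability  <- rnorm(nrow(dat), mean = 50, sd = 20)

mod <- lm(Viability ~ Treatment * Expression, data = dat)
b   <- coef(mod)   # 8 coefficients; the last 3 are the interaction terms

# Hypothesis matrix picking out the three interaction coefficients
H <- matrix(0, 3, 8)
H[1, 6] <- H[2, 7] <- H[3, 8] <- 1

# F statistic for H b = 0, computed from the coefficients and their
# covariance matrix only
q      <- nrow(H)
Hb     <- H %*% b
F_wald <- drop(t(Hb) %*% solve(H %*% vcov(mod) %*% t(H)) %*% Hb) / q

# The same test as a comparison of nested models
reduced <- lm(Viability ~ Treatment + Expression, data = dat)
F_anova <- anova(reduced, mod)$F[2]

print(all.equal(F_wald, F_anova))
```

The two F statistics agree up to floating-point error for a linear model, whatever the coefficients happen to be called.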
Thank you for the suggestion,
John
>
> - Peter
>
>> On 17 Sep 2023, at 16:43 , John Fox <jfox using mcmaster.ca> wrote:
>>
>> Dear Robert,
>>
>> Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it.
>>
>> The symbolic form of the hypothesis isn't really relevant for Anova(), which doesn't use the printed representation of each hypothesis, and so, despite the warnings, you get the correct ANOVA table. In your case, where the data are balanced, with 4 cases per cell, Anova(mod) and summary(mod) are equivalent, which makes me wonder why you would use Anova() in the first place.
>>
>> To elaborate a bit, linearHypothesis() does tolerate arithmetic operators in coefficient names if you specify the hypothesis symbolically rather than as a hypothesis matrix. For example, to test the interaction:
>>
>> ------- snip --------
>>
>> > linearHypothesis(mod,
>> + c("TreatmentDabrafenib:ExpressionCD271+ = 0",
>> + "TreatmentTrametinib:ExpressionCD271+ = 0",
>> + "TreatmentCombination:ExpressionCD271+ = 0"))
>> Linear hypothesis test
>>
>> Hypothesis:
>> TreatmentDabrafenib:ExpressionCD271+ = 0
>> TreatmentTrametinib:ExpressionCD271+ = 0
>> TreatmentCombination:ExpressionCD271+ = 0
>>
>> Model 1: restricted model
>> Model 2: Viability ~ Treatment * Expression
>>
>>   Res.Df   RSS Df Sum of Sq     F Pr(>F)
>> 1     27 18966
>> 2     24 16739  3    2226.3 1.064 0.3828
>>
>> ------- snip --------
>>
>> Alternatively:
>>
>> ------- snip --------
>>
>> > H <- matrix(0, 3, 8)
>> > H[1, 6] <- H[2, 7] <- H[3, 8] <- 1
>> > H
>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
>> [1,]    0    0    0    0    0    1    0    0
>> [2,]    0    0    0    0    0    0    1    0
>> [3,]    0    0    0    0    0    0    0    1
>>
>> > linearHypothesis(mod, H)
>> Linear hypothesis test
>>
>> Hypothesis:
>>
>>
>> Model 1: restricted model
>> Model 2: Viability ~ Treatment * Expression
>>
>>   Res.Df   RSS Df Sum of Sq     F Pr(>F)
>> 1     27 18966
>> 2     24 16739  3    2226.3 1.064 0.3828
>> Warning message:
>> In printHypothesis(L, rhs, names(b)) :
>> one or more coefficients in the hypothesis include
>> arithmetic operators in their names;
>> the printed representation of the hypothesis will be omitted
>>
>> ------- snip --------
>>
>> There's no good reason that linearHypothesis() should try to express each hypothesis symbolically for Anova(), since Anova() doesn't use that information. When I have some time, I'll arrange to avoid the warning.
>>
>> Best,
>> John
>>
>> --
>> John Fox, Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> web: https://www.john-fox.ca/
>> On 2023-09-16 4:39 p.m., Robert Baer wrote:
>>> When doing Anova using the car package, I get a print warning that is
>>> unexpected. It seemingly involves having my flow cytometry factor levels
>>> named CD271+ and CD271-. But I am not sure this warning is
>>> intended behavior. Is there any explanation of whether I'm doing something
>>> wrong? Why can't I have CD271+ and CD271- as factor levels? It's legal
>>> text, isn't it?
>>> library(car)
>>> mod = aov(Viability ~ Treatment*Expression, data = dat1)
>>> Anova(mod, type =2)
>>> Anova Table (Type II tests)
>>>
>>> Response: Viability
>>>                       Sum Sq Df F value    Pr(>F)
>>> Treatment            19447.3  3  9.2942 0.0002927 ***
>>> Expression            2669.8  1  3.8279 0.0621394 .
>>> Treatment:Expression  2226.3  3  1.0640 0.3828336
>>> Residuals            16739.3 24
>>> ---
>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>> Warning messages:
>>> 1: In printHypothesis(L, rhs, names(b)) :
>>>   one or more coefficients in the hypothesis include
>>>   arithmetic operators in their names;
>>>   the printed representation of the hypothesis will be omitted
>>> 2: In printHypothesis(L, rhs, names(b)) :
>>>   one or more coefficients in the hypothesis include
>>>   arithmetic operators in their names;
>>>   the printed representation of the hypothesis will be omitted
>>> 3: In printHypothesis(L, rhs, names(b)) :
>>>   one or more coefficients in the hypothesis include
>>>   arithmetic operators in their names;
>>>   the printed representation of the hypothesis will be omitted
>>> The code to reproduce:
>>> ```
>>> dat1 <-structure(list(Treatment = structure(c(1L, 1L, 1L, 1L, 3L, 1L,
>>> 1L, 1L, 1L, 2L, 2L, 2L,
>>> 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
>>> 3L, 3L, 4L, 4L, 4L, 4L,
>>> 4L, 4L, 4L, 4L), levels = c("Control",
>>> "Dabrafenib", "Trametinib", "Combination"), class = "factor"),
>>> Expression = structure(c(2L, 2L, 2L, 2L, 2L, 1L,
>>> 1L, 1L,
>>> 1L, 2L, 2L, 2L, 2L, 1L,
>>> 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
>>> 1L, 2L, 2L, 2L, 2L, 1L,
>>> 1L, 1L, 1L), levels = c("CD271-",
>>> "CD271+"), class = "factor"),
>>> Viability = c(128.329809725159, 24.2360176821065,
>>> 76.3597924274457, 11.0128771862387, 21.4683836248318,
>>> 140.784162982894, 87.4303286565443,
>>> 118.181818181818, 53.603690178743,
>>> 51.2973284643475, 5.47760907168941,
>>> 27.1574091870075, 50.8360561214684,
>>> 56.5250816836441, 28.6949836632712,
>>> 93.2731116663463, 71.900826446281,
>>> 32.2314049586777, 24.2360176821065,
>>> 27.4649240822602, 24.0822602344801,
>>> 26.542379396502, 30.693830482414,
>>> 27.772438977513, 13.4729963482606,
>>> 8.24524312896406, 18.5469921199308,
>>> 13.9342686911397, 13.3192389006342,
>>> 19.9308091485681, 17.6244474341726,
>>> 16.2406304055353)),
>>> row.names = c(NA,
>>> -32L),
>>> class = c("tbl_df", "tbl", "data.frame"))
>>> mod = aov(Viability ~ Treatment*Expression, data = dat1)
>>> summary(mod)
>>> library(car)
>>> Anova(mod, type =2)
>>> ```
>>> > sessionInfo()
>>> R version 4.3.1 (2023-06-16 ucrt)
>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>> Running under: Windows 11 x64 (build 25951)
>>>
>>> Matrix products: default
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
>>> [4] LC_NUMERIC=C LC_TIME=English_United States.utf8
>>>
>>> time zone: America/Chicago
>>> tzcode source: internal
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] car_3.1-2 carData_3.0-5 tidyr_1.3.0 readr_2.1.4 readxl_1.4.3 ggplot2_3.4.3 dplyr_1.1.3
>>>
>>> loaded via a namespace (and not attached):
>>> [1] crayon_1.5.2 vctrs_0.6.3 cli_3.6.1 rlang_1.1.1 purrr_1.0.2 generics_0.1.3 labeling_0.4.3
>>> [8] bit_4.0.5 glue_1.6.2 colorspace_2.1-0 hms_1.1.3 scales_1.2.1 fansi_1.0.4 grid_4.3.1
>>> [15] cellranger_1.1.0 abind_1.4-5 munsell_0.5.0 tibble_3.2.1 tzdb_0.4.0 lifecycle_1.0.3 compiler_4.3.1
>>> [22] pkgconfig_2.0.3 rstudioapi_0.15.0 farver_2.1.1 R6_2.5.1 tidyselect_1.2.0 utf8_1.2.3 parallel_4.3.1
>>> [29] vroom_1.6.3 pillar_1.9.0 magrittr_2.0.3 bit64_4.0.5 tools_4.3.1 withr_2.5.0 gtable_0.3.4
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com
>