[R] summary vs anova
peter dalgaard
pdalgd at gmail.com
Mon Dec 19 16:18:52 CET 2011
On Dec 19, 2011, at 15:09 , Brent Pedersen wrote:
> Hi, I'm sure this is simple, but I haven't been able to find this in TFM,
It's not _that_ simple. You likely need TFtextbook rather than TFM. Most (but not all) will go into at least some detail of coding categorical variables using dummy variables.
> [snip]
>
> I understand (hopefully correctly) that anova() tests by adding each covariate
> to the model in the order it is specified in the formula.
>
Yes. Note, however, that categorical variables cause more than one dummy covariate to be added.
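To see the "more than one dummy covariate" point concretely, it can help to look at the design matrix. The sketch below uses made-up data (the level names of smokes and the age column are assumptions, not the original poster's data) and relies on R's default treatment contrasts:

```r
# Hypothetical data: a 3-level factor plus one numeric covariate
d <- data.frame(
  smokes = factor(c("never", "former", "current", "never", "current", "former"),
                  levels = c("never", "former", "current")),
  age = c(30, 45, 52, 38, 61, 47)
)

# Design matrix that lm() would build for ~ smokes + age
X <- model.matrix(~ smokes + age, data = d)

# The 3-level factor contributes 2 dummy columns; "never" (the first
# level) is the reference and gets no column of its own
print(colnames(X))
print(X)
```

So the single smokes line in anova() stands for two columns being added at once.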
> More specific questions are:
>
> 1) How do the p-values for smokes* in summary(model) relate to the
> Pr(>F) for smokes in anova?
If the last Pr(>F) corresponds to a single-df term, then F=t^2 for that term (only), and the p value is the same. If the last Pr(>F) is for a k-df term, it corresponds to simultaneously testing that the corresponding k regression coefficients are _all_ zero; the joint p value cannot in general be calculated from tests on individual coefficients. However, they at least test related hypotheses.
p values higher up the list in anova() test hypotheses in models obtained after removing all subsequent terms, so they are not in general comparable to the t tests in summary().
If you use drop1(...., test="F") instead of anova(), then you avoid the sequential aspect and all 1-df tests correspond to t-tests in the summary table.
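A small simulated check of these relations, as a sketch only: the data, the variable names (smokes, age, y) and the effect sizes are invented for illustration.

```r
set.seed(1)
n <- 60
d <- data.frame(
  smokes = factor(rep(c("never", "former", "current"), each = n / 3),
                  levels = c("never", "former", "current")),
  age = rnorm(n, mean = 50, sd = 10)
)
d$y <- 1 + 0.05 * d$age + rnorm(n)

fit <- lm(y ~ smokes + age, data = d)

# age is the last term and has 1 df: sequential F equals t^2
t_age <- coef(summary(fit))["age", "t value"]
F_seq <- anova(fit)["age", "F value"]
print(all.equal(t_age^2, F_seq))

# drop1() tests each term given all the others, so the 1-df age test
# has the same p value as the t test in summary()
d1 <- drop1(fit, test = "F")
print(all.equal(d1["age", "Pr(>F)"], coef(summary(fit))["age", "Pr(>|t|)"]))

# smokes is a 2-df term: one joint F test, versus two separate t tests
print(d1["smokes", c("Df", "F value", "Pr(>F)")])
print(coef(summary(fit))[c("smokesformer", "smokescurrent"), ])

# The sequential smokes line in anova() ignores age in the numerator,
# so it need not agree with the drop1() line for smokes
print(anova(fit))
```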
> 2) what do the p-values for each of those smokes* mean exactly?
In the default parametrization, they correspond to comparisons between the stated level and the reference (first) level of the factor. In different contrast parametrizations, the interpretation will differ; the only complete advice is that you need to understand the relation between the factor levels and the rows of the design matrix.
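As an illustration of how the meaning of a coefficient depends on the contrasts, here is a one-factor sketch with invented numbers. It compares the default treatment contrasts with sum-to-zero contrasts (contr.sum), picked here only as one example of a different parametrization:

```r
# Hypothetical balanced data, one 3-level factor
d <- data.frame(
  smokes = factor(rep(c("never", "former", "current"), each = 4),
                  levels = c("never", "former", "current")),
  y = c(5.1, 4.8, 5.3, 5.0,  5.9, 6.2, 5.7, 6.0,  7.1, 6.8, 7.3, 7.0)
)
means <- tapply(d$y, d$smokes, mean)

# Default (treatment) contrasts: each coefficient is
# "stated level minus reference level"
fit_trt <- lm(y ~ smokes, data = d)
print(all.equal(unname(coef(fit_trt)["smokesformer"]),
                unname(means["former"] - means["never"])))

# Sum-to-zero contrasts: smokes1 is now
# "first level minus the average of the level means"
fit_sum <- lm(y ~ smokes, data = d, contrasts = list(smokes = "contr.sum"))
print(all.equal(unname(coef(fit_sum)["smokes1"]),
                unname(means["never"] - mean(means))))

# Same fitted values, different coefficients and different t tests
print(all.equal(fitted(fit_trt), fitted(fit_sum)))
```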
> 3) the summary above shows the values for diseasestate1 and diseasestate2
> how can I get the p-value for diseasecontrol? (or, e.g. genderfemale)
You can't. It would correspond to a comparison of that level with itself.
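The following sketch (invented data; the level names are chosen only to reproduce coefficient names like diseasestate1) shows that the reference level simply has no coefficient. Changing the reference with relevel() does not give a p-value for "control" in isolation; it just re-expresses the same comparison from the other side:

```r
# Hypothetical data with "control" as the first (reference) level
d <- data.frame(
  diseasestate = factor(rep(c("control", "1", "2"), each = 3),
                        levels = c("control", "1", "2")),
  y = c(2.0, 2.3, 1.9,  3.1, 2.8, 3.3,  4.0, 4.2, 3.7)
)

fit <- lm(y ~ diseasestate, data = d)
print(names(coef(fit)))   # no "diseasestatecontrol" entry

# Make level "1" the reference instead
d2 <- transform(d, diseasestate = relevel(diseasestate, ref = "1"))
fit2 <- lm(y ~ diseasestate, data = d2)
print(names(coef(fit2)))  # now "control" appears and "1" does not

# The control-vs-1 comparison is the old 1-vs-control one with the sign flipped
print(all.equal(unname(coef(fit2)["diseasestatecontrol"]),
                -unname(coef(fit)["diseasestate1"])))
```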
>
> thanks.
>
--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com