[R] summary vs anova

peter dalgaard pdalgd at gmail.com
Mon Dec 19 16:18:52 CET 2011


On Dec 19, 2011, at 15:09 , Brent Pedersen wrote:

> Hi, I'm sure this is simple, but I haven't been able to find this in TFM,

It's not _that_ simple. You likely need TFtextbook rather than TFM. Most (but not all) regression textbooks go into at least some detail on coding categorical variables as dummy variables.

> [snip]
> 
> I understand (hopefully correctly) that anova() tests by adding each covariate
> to the model in order it is specified in the formula.
> 

Yes. Note, however, that categorical variables cause more than one dummy covariate to be added.
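
To see the dummy columns directly, here is a small sketch with a made-up three-level factor. The name `smokes` is borrowed from the question; the level names and data are assumptions, and default treatment contrasts are assumed.

```r
# Hypothetical factor; level names are made up for illustration.
d <- data.frame(
  smokes = factor(c("never", "former", "current", "never", "current"),
                  levels = c("never", "former", "current"))
)

# Default treatment contrasts: an intercept plus one 0/1 column for each
# non-reference level, so this single term contributes 2 columns (2 df).
X <- model.matrix(~ smokes, data = d)
colnames(X)
```

The reference level ("never" here) gets no column of its own; it is absorbed into the intercept.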

> More specific questions are:
> 
> 1) How do the p-values for smokes* in summary(model) relate to the
>   Pr(>F) for smokes in anova

If the last Pr(>F) corresponds to a single-df term, then F = t^2 for that term (only), and the p value is the same. If the last Pr(>F) is for a k-df term, it corresponds to simultaneously testing that the corresponding k regression coefficients are _all_ zero; the joint p value cannot in general be calculated from the tests on the individual coefficients. However, they at least test related hypotheses.
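
Both cases can be checked numerically. This is a minimal sketch on simulated data; the variables `y`, `age` and `smokes` and their values are assumptions, not the poster's data.

```r
set.seed(1)
n <- 60
# Hypothetical data: one numeric covariate and one 3-level factor.
d <- data.frame(
  age    = rnorm(n, mean = 50, sd = 10),
  smokes = factor(rep(c("never", "former", "current"), length.out = n),
                  levels = c("never", "former", "current"))
)
d$y <- 1 + 0.05 * d$age + rnorm(n)

# Case 1: the last term (age) has 1 df, so its F is the square of its t.
fit1  <- lm(y ~ smokes + age, data = d)
t_age <- coef(summary(fit1))["age", "t value"]
F_age <- anova(fit1)["age", "F value"]
all.equal(t_age^2, F_age)

# Case 2: the last term (smokes) has 2 df. Its F tests both dummy
# coefficients jointly, which is the same as comparing the nested models.
fit2    <- lm(y ~ age + smokes, data = d)
p_seq   <- anova(fit2)["smokes", "Pr(>F)"]
p_joint <- anova(lm(y ~ age, data = d), fit2)[2, "Pr(>F)"]
all.equal(p_seq, p_joint)
```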

P values higher up the list in anova() test hypotheses in the models obtained by removing all subsequent terms, so they are not in general comparable to the t tests in summary().
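
The sequential aspect shows up in the sums of squares. In this sketch (simulated data, hypothetical variable names), the Sum Sq for the first term equals the drop in residual sum of squares from the intercept-only model to the model with that term alone, whatever terms come later. Note that anova() still takes the denominator of each F from the residual mean square of the full model.

```r
set.seed(2)
n <- 60
# Hypothetical data, as an illustration only.
d <- data.frame(
  age    = rnorm(n, mean = 50, sd = 10),
  smokes = factor(rep(c("never", "former", "current"), length.out = n),
                  levels = c("never", "former", "current"))
)
d$y <- 1 + 0.05 * d$age + 0.5 * (d$smokes == "current") + rnorm(n)

full <- lm(y ~ smokes + age, data = d)

# Sequential SS for smokes (listed first): age has not entered yet.
ss_seq  <- anova(full)["smokes", "Sum Sq"]
ss_drop <- deviance(lm(y ~ 1, data = d)) - deviance(lm(y ~ smokes, data = d))
all.equal(ss_seq, ss_drop)

# With the order reversed, smokes is tested after adjusting for age,
# so its Sum Sq is in general a different number.
anova(lm(y ~ age + smokes, data = d))["smokes", "Sum Sq"]
```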

If you use drop1(...., test="F") instead of anova(), then you avoid the sequential aspect and all 1-df tests correspond to t-tests in the summary table.
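
A short sketch of that correspondence, again on simulated data with hypothetical variable names. The 1-df term is deliberately listed first, so that its anova() row is the sequential test while its drop1() row matches the t test.

```r
set.seed(3)
n <- 60
# Hypothetical data, as an illustration only.
d <- data.frame(
  age    = rnorm(n, mean = 50, sd = 10),
  smokes = factor(rep(c("never", "former", "current"), length.out = n),
                  levels = c("never", "former", "current"))
)
d$y <- 1 + 0.05 * d$age + rnorm(n)

fit <- lm(y ~ age + smokes, data = d)

# drop1() removes each term in turn from the full model.
d1      <- drop1(fit, test = "F")
p_drop1 <- d1["age", "Pr(>F)"]
p_t     <- coef(summary(fit))["age", "Pr(>|t|)"]
all.equal(p_drop1, p_t)

# For comparison: anova() tests age before smokes has entered the model.
p_seq <- anova(fit)["age", "Pr(>F)"]
```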

> 2) what do the p-values for each of those smokes* mean exactly?

In the default parametrization, they correspond to comparisons between the stated level and the reference (first) level of the factor. In different contrast parametrizations, the interpretation will differ; the only complete advice is that you need to understand the relation between the factor levels and the rows of the design matrix.
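
As an illustration of how the parametrization changes the meaning of the coefficients, here is a sketch with a single simulated factor (names and data are assumptions). With treatment contrasts a coefficient is a difference from the reference level; with sum-to-zero contrasts the intercept is the mean of the level means instead.

```r
set.seed(4)
# Hypothetical one-factor data, 10 observations per level.
d <- data.frame(
  smokes = factor(rep(c("never", "former", "current"), each = 10),
                  levels = c("never", "former", "current"))
)
d$y <- rnorm(30) + ifelse(d$smokes == "current", 1, 0)

# Observed mean of y within each level, as a plain named vector.
means <- sapply(split(d$y, d$smokes), mean)

# Treatment contrasts (the default): coefficient = level minus reference.
fit_trt  <- lm(y ~ smokes, data = d)
b_former <- coef(fit_trt)[["smokesformer"]]
all.equal(b_former, means[["former"]] - means[["never"]])

# Sum-to-zero contrasts: the intercept is the mean of the level means,
# and the other coefficients are deviations from it.
fit_sum <- lm(y ~ smokes, data = d, contrasts = list(smokes = "contr.sum"))
b0_sum  <- coef(fit_sum)[["(Intercept)"]]
all.equal(b0_sum, mean(means))

# The rows of the design matrix show which coding is in use.
unique(model.matrix(fit_sum))
```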

> 3) the summary above shows the values for diseasestate1 and diseasestate2
>   how can I get the p-value for diseasecontrol? (or, e.g. genderfemale)

You can't. It would correspond to a comparison of that level with itself.
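
What can be changed is which level serves as the reference. The sketch below uses relevel() on a simulated factor; the name `disease` and its levels are assumptions loosely modelled on the question. The control level then gets a coefficient, but it is the same comparison as before with the sign flipped, and the new reference level is the one without a row.

```r
set.seed(5)
# Hypothetical data; "control" is the first (reference) level.
d <- data.frame(
  disease = factor(rep(c("control", "state1", "state2"), each = 10),
                   levels = c("control", "state1", "state2"))
)
d$y <- rnorm(30) + ifelse(d$disease == "state1", 1, 0)

fit_a <- lm(y ~ disease, data = d)
names(coef(fit_a))   # no coefficient for the reference level "control"

# Refit with "state1" as the reference level.
d$disease_b <- relevel(d$disease, ref = "state1")
fit_b <- lm(y ~ disease_b, data = d)
names(coef(fit_b))   # now "state1" is the level without a coefficient

# control vs state1 is the same comparison in both fits: same p value,
# coefficient with the opposite sign.
p_a <- coef(summary(fit_a))["diseasestate1", "Pr(>|t|)"]
p_b <- coef(summary(fit_b))["disease_bcontrol", "Pr(>|t|)"]
all.equal(p_a, p_b)
```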

> 
> thanks.
> 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


