# [BioC] Linear Models and ANOVA

James W. MacDonald jmacdon at med.umich.edu
Fri Dec 17 16:12:26 CET 2010

```
Hi Thomas,

On 12/16/2010 4:03 PM, Thomas Hampton wrote:
> This is an off topic question related more to R and statistics, but I
> will impose myself, if you don't mind.

You are correct. This has nothing to do with Bioconductor, nor even the
analysis of high-throughput data. You would be better served by asking
on R-help, although you might need a fairly thick skin, depending on who
replies, as this isn't really a question about R.

Alternatively, you could do some reading on your own to see why the
output is different. See

?anova.lm
?summary.lm

which should clear up the confusion for you.

If that doesn't help, Julian Faraway has an excellent book that covers
linear models in R. If you are lucky, you might even be able to find the
pdf of that book somewhere out on the intertubes, as it was freely
available in the past before he published.

Best,

Jim

>
> Here is my issue.
>
> R anova is essentially a way to interpret some linear model such as
>
> fit <- lm(y ~a*b)
>
> You can generate nice p values by doing something like
>
> anova(lm(y ~a*b))
>
> But you could also generate p values like this:
>
> summary(lm(y~a*b))
>
> I find though, that the p values you generate may be different
> depending on whether you call summary.lm or whether
> you get them from anova.lm.
>
> For example:
>  > summary(lm(formula = Alertness ~ Gender * Dosage, data = data.ex2))
>
> Call:
> lm(formula = Alertness ~ Gender * Dosage, data = data.ex2)
>
> Residuals:
>    Min     1Q Median     3Q    Max
> -6.500 -3.375  0.000  1.562 10.500
>
> Coefficients:
>                 Estimate Std. Error t value Pr(>|t|)
> (Intercept)       15.750      2.546   6.185 4.69e-05 ***
> Genderm           -4.500      3.601  -1.250    0.235
> Dosageb            1.000      3.601   0.278    0.786
> Genderm:Dosageb    0.250      5.093   0.049    0.962
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 5.093 on 12 degrees of freedom
> Multiple R-squared: 0.2079, Adjusted R-squared: 0.009862
> F-statistic: 1.05 on 3 and 12 DF, p-value: 0.4062
>
>  > anova(lm(formula = Alertness ~ Gender * Dosage, data = data.ex2))
> Analysis of Variance Table
>
>               Df  Sum Sq Mean Sq F value Pr(>F)
> Gender         1  76.562  76.562  2.9518 0.1115
> Dosage         1   5.062   5.062  0.1952 0.6665
> Gender:Dosage  1   0.063   0.063  0.0024 0.9617
> Residuals     12 311.250  25.938
>
>
> The anova output is tidier to look at. But why are the anova p values
> smaller for Gender and Dosage?
>
>
>
> Tom

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

```
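For readers who want to see what ?anova.lm and ?summary.lm are describing, here is a minimal sketch in R. It does not use the original data.ex2, which is not shown in the thread. The data frame `d` is simulated, and all object names (`d`, `fit`, `fit_sum`, and so on) are made up for illustration. The point it tries to show: summary() reports t tests on individual coefficients, which under R's default treatment contrasts compare against a reference level, while anova() reports sequential F tests on whole terms. The two are different hypotheses for the main effects, but they coincide for the last term in the model, which is why the interaction p values in the output above agree (0.962 and 0.9617).

```r
# Hypothetical balanced 2 x 2 data, 4 observations per cell.
# This is NOT the original data.ex2; the values are simulated.
set.seed(1)
d <- expand.grid(rep = 1:4, Gender = c("f", "m"), Dosage = c("a", "b"))
d$Gender <- factor(d$Gender)
d$Dosage <- factor(d$Dosage)
d$Alertness <- rnorm(nrow(d), mean = 15, sd = 5)

fit <- lm(Alertness ~ Gender * Dosage, data = d)

# summary(): one t test per coefficient. With treatment contrasts,
# "Genderm" is the m - f difference at the reference Dosage level only.
coef_p <- coef(summary(fit))[, "Pr(>|t|)"]

# anova(): sequential F tests, each term added after the ones before it.
aov_tab <- anova(fit)

# The 1-df interaction is the last term, so both tests agree for it.
same_interaction <- all.equal(unname(coef_p["Genderm:Dosageb"]),
                              aov_tab["Gender:Dosage", "Pr(>F)"])

# Sum-to-zero contrasts turn the coefficients into effects averaged over
# the other factor. In a balanced design the t tests then match anova().
fit_sum <- lm(Alertness ~ Gender * Dosage, data = d,
              contrasts = list(Gender = "contr.sum", Dosage = "contr.sum"))
p_sum <- unname(coef(summary(fit_sum))[-1, "Pr(>|t|)"])
p_aov <- aov_tab[1:3, "Pr(>F)"]

print(same_interaction)
print(cbind(summary_contr_sum = p_sum, anova = p_aov))
```

The agreement under contr.sum relies on the design being balanced. With unequal cell counts the sequential tests from anova() depend on the order of the terms, and the two sets of p values would generally differ again.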