# [R] Interpreting summary.lm for a 2 factor anova

Richard M. Heiberger rmh at temple.edu
Sun Dec 4 06:29:29 CET 2016

```
As Petr Pikal mentioned, the difficulty in interpretation is entirely due
to the set of contrasts you chose. The default treatment contrasts are
not orthogonal and are therefore the most difficult to interpret.
The note in ?aov warns of this difficulty.

Sum contrasts will give you the numbers that are easiest to interpret.

options(contrasts = c("contr.sum", "contr.poly"))
warpbreakssum.aov <- aov(breaks ~ wool * tension, data = warpbreaks)
coef(warpbreakssum.aov)
model.tables(warpbreakssum.aov, type="effects")
model.tables(warpbreakssum.aov, type="means")
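
# A minimal sketch (an addition for illustration, not from the original
# thread) of why the sum-contrast coefficients are easy to read. It assumes
# the built-in warpbreaks data; fit.sum, cell.means and grand are
# illustrative names. The contrasts are passed to aov() directly so that the
# check does not depend on the global options() setting.
fit.sum <- aov(breaks ~ wool * tension, data = warpbreaks,
               contrasts = list(wool = "contr.sum", tension = "contr.sum"))
b <- coef(fit.sum)

# 2 x 3 table of observed cell means (rows: wool, columns: tension)
cell.means <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))
grand <- mean(cell.means)

# The intercept is the grand mean, and each main-effect coefficient is a
# marginal mean minus the grand mean, which is what model.tables() reports
# as an "effect". Each comparison is expected to return TRUE.
all.equal(unname(b["(Intercept)"]), grand)
all.equal(unname(b["wool1"]),    mean(cell.means["A", ]) - grand)
all.equal(unname(b["tension1"]), mean(cell.means[, "L"]) - grand)
all.equal(unname(b["tension2"]), mean(cell.means[, "M"]) - grand)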

John Fox showed the algebra using the default treatment contrasts.
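
# A minimal sketch (added for illustration; object names are hypothetical) of
# that algebra: with treatment contrasts, each cell mean is the sum of the
# coefficients whose dummy variables equal 1 in that cell. Contrasts are set
# explicitly because options(contrasts = ...) was changed above.
trt <- list(wool = "contr.treatment", tension = "contr.treatment")
fit.trt <- aov(breaks ~ wool * tension, data = warpbreaks, contrasts = trt)
b <- coef(fit.trt)

# One row per cell, wool varying fastest: A.L, B.L, A.M, B.M, A.H, B.H
cells <- expand.grid(wool = levels(warpbreaks$wool),
                     tension = levels(warpbreaks$tension))
X <- model.matrix(~ wool * tension, data = cells, contrasts.arg = trt)

# The model is saturated, so X %*% b reproduces the observed cell means.
obs <- with(warpbreaks, tapply(breaks, interaction(wool, tension), mean))
all.equal(as.vector(X %*% b), as.vector(obs))

# Cell B.H: intercept + woolB + tensionH + woolB:tensionH
sum(b[c("(Intercept)", "woolB", "tensionH", "woolB:tensionH")])

# An interaction coefficient is a difference of differences, not a
# difference from the base cell: the B-minus-A gap at tension M minus the
# B-minus-A gap at tension L.
(obs["B.M"] - obs["A.M"]) - (obs["B.L"] - obs["A.L"])
b["woolB:tensionM"]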

For a full understanding you will need to read more, in a textbook, about
sets of linear contrasts and their algebra.
I recommend Section 10.3 in mine, of course.

Statistical Analysis and Data Display:
An Intermediate Course with Examples in R
Heiberger, Richard M., Holland, Burt

http://www.springer.com/us/book/9781493921218

On Sat, Dec 3, 2016 at 11:46 PM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
> On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
>
>> Dear Sir,
>>
>> Many thanks for the explanation. Prior to your email (with some help from
>> a friend of mine) I was able to figure this one out. If we look at the
>> model : -
>>
>> y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4. woolB.TensionM
>> + B5.woolB.TensionH + error
>>
>> Here woolB, tensionM, tensionH are the dummy indicator variables similar
>> to how you have defined them.
>>
>> Now suppose we consider y1,..,yn, all in group A.L (say).
>>
>> Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + 0 +
>> 0 + 0 + 0.
>>
>> This should be : y1 + ... yn = n . intercept
>
> What was confusing me was how to compute the cell mean in woolB,tensionH
>> cell.
>>
>> If we have y_1,...,y_n all in group B.H then :-
>>
>> y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 +  B5
>>
>> This should be : y_1 + ... +y_n = n( intercept + B1 + 0 + B3 + 0 +  B5 )
>
>
>> Therefore average of group B.H = intercept + B1 + B3 + B5
>>
>> Many thanks and Best Regards,
>> Ashim
>>
>>
>>
>> On Sat, Dec 3, 2016 at 7:15 PM, Fox, John <jfox at mcmaster.ca> wrote:
>>
>>> Dear Ashim,
>>>
>>> Sorry to chime in late, and my apologies if someone has already pointed
>>> this out, but here's the relationship between the cell means and the model
>>> coefficients, using the row-basis of the model matrix:
>>>
>>> -------------------------- snip ------------------------
>>>
>>> > means <- with( warpbreaks, tapply( breaks, interaction(wool, tension),
>>> mean ) )
>>> > x.A <- rep(c(0, 1), 3)
>>> > x.B1 <- rep(c(0, 1, 0), each=2)
>>> > x.B2 <- rep(c(0, 0, 1), each=2)
>>> > x.AB1 <- x.A*x.B1
>>> > x.AB2 <- x.A*x.B2
>>> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
>>> > X.basis
>>>        x.A x.B1 x.B2 x.AB1 x.AB2
>>> [1,] 1   0    0    0     0     0
>>> [2,] 1   1    0    0     0     0
>>> [3,] 1   0    1    0     0     0
>>> [4,] 1   1    1    0     1     0
>>> [5,] 1   0    0    1     0     0
>>> [6,] 1   1    0    1     0     1
>>> > solve(X.basis, means)
>>>                 x.A      x.B1      x.B2     x.AB1     x.AB2
>>>  44.55556 -16.33333 -20.55556 -20.00000  21.11111  10.55556
>>> > coef(aov(breaks ~ wool * tension, data = warpbreaks))
>>>    (Intercept)          woolB       tensionM       tensionH woolB:tensionM
>>>       44.55556      -16.33333      -20.55556      -20.00000       21.11111
>>> woolB:tensionH
>>>       10.55556
>>>
>>> -------------------------- snip ------------------------
>>>
>>> I hope this helps,
>>>  John
>>>
>>> -----------------------------
>>> John Fox, Professor
>>> McMaster University
>>> Hamilton, Ontario
>>> Web: socserv.mcmaster.ca/jfox
>>>
>>>
>>>
>>> > -----Original Message-----
>>> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim
>>> Kapoor
>>> > Sent: December 3, 2016 12:19 AM
>>> > To: David Winsemius <dwinsemius at comcast.net>
>>> > Cc: r-help at r-project.org
>>> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
>>> >
>>> > Please allow me to rephrase my query.
>>> >
>>> > > model.tables(model,"m")
>>> > Tables of means
>>> > Grand mean
>>> >
>>> > 28.14815
>>> >
>>> >  wool
>>> > wool
>>> >      A      B
>>> > 31.037 25.259
>>> >
>>> >  tension
>>> > tension
>>> >     L     M     H
>>> > 36.39 26.39 21.67
>>> >
>>> >  wool:tension
>>> >     tension
>>> > wool L     M     H
>>> >    A 44.56 24.00 24.56
>>> >    B 28.22 28.78 18.78
>>> > >
>>> >
>>> >
>>> > The above is the same as :
>>> >
>>> > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
>>> >      A.L      B.L      A.M      B.M      A.H      B.H
>>> > 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
>>> >
>>> > For reference:
>>> >
>>> > > model <- aov(breaks ~ wool * tension, data = warpbreaks)
>>> > > summary.lm(model)
>>> >
>>> > Call:
>>> > aov(formula = breaks ~ wool * tension, data = warpbreaks)
>>> >
>>> > Residuals:
>>> >      Min       1Q   Median       3Q      Max
>>> > -19.5556  -6.8889  -0.6667   7.1944  25.4444
>>> >
>>> > Coefficients:
>>> >                Estimate Std. Error t value Pr(>|t|)
>>> > (Intercept)      44.556      3.647  12.218 2.43e-16 ***
>>> > woolB           -16.333      5.157  -3.167 0.002677 **
>>> > tensionM        -20.556      5.157  -3.986 0.000228 ***
>>> > tensionH        -20.000      5.157  -3.878 0.000320 ***
>>> > woolB:tensionM   21.111      7.294   2.895 0.005698 **
>>> > woolB:tensionH   10.556      7.294   1.447 0.154327
>>> > ---
>>> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>> >
>>> > Residual standard error: 10.94 on 48 degrees of freedom
>>> > Multiple R-squared:  0.3778,    Adjusted R-squared:  0.3129
>>> > F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
>>> >
>>> >
>>> > Now I'll explain what is confusing me in the output of summary.lm.
>>> >
>>> > Coeff of Intercept = 44.556  = cell mean for A.L. This is the base.
>>> >
>>> > Coeff of woolB:L = -16.333 = 28.22222 - 44.556. This is the difference
>>> of this
>>> > cell mean(B:L) from the base.
>>> >
>>> > Coeff of woolA:tensionM = -20.556  = 24.000- 44.556. This is the
>>> difference of
>>> > this cell mean (A:M)  from the base.
>>> >
>>> > Coeff of woolA:tensionH = -20.000  = 24.55556 - 44.556. This is the
>>> difference
>>> > of this cell mean(A:H) from the base.
>>> >
>>> > This is where it stops being the difference from the base.
>>> >
>>> > Coeff of woolB:tensionM = 21.111 should turn out to be 28.77778 -
>>> 44.556 but
>>> > this is -15.77822
>>> >
>>> > Coeff of woolB:tensionH = 10.556 should turn out to be  18.77778 -
>>> 44.556 but
>>> > this is -25.77822
>>> >
>>> > In the above 2 cases, we can't say that the coefficient = cell mean -
>>> base case.
>>> > Can you tell me what should be the statement to be made ?
>>> >
>>> >
>>> > Best Regards,
>>> > Ashim
>>> >
>>> > PS : My apologies for emailing my query to this list. Can you tell me
>>> the names
>>> > of a few (active) statistics help lists?
>>> >
>>> > On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius <dwinsemius at comcast.net
>>> >
>>> > wrote:
>>> >
>>> > >
>>> > > > On Dec 2, 2016, at 9:09 AM, David Winsemius <dwinsemius at comcast.net
>>> >
>>> > > wrote:
>>> > > >
>>> > > >>
>>> > > >> On Dec 2, 2016, at 6:16 AM, Ashim Kapoor <ashimkapoor at gmail.com>
>>> > wrote:
>>> > > >>
>>> > > >> Dear Pikal,
>>> > > >>
>>> > > >> All levels except the interactions are compared to the Intercept.
>>> > > >> I'm a little confused as to what's going on in interaction terms
>>> > > >> e.g. the cell wool B : tension M. Its mean is:
>>> > > >> 28.78 and 28.78 - 44.56 = -15.78 != 21.111.
>>> > > >>
>>> > > >> It's something like 44.56 (intercept) -16.333 (wool B) -20.556
>>> > > >> (tension
>>> > > >> M)  + 21.111 (woolB:tensionM) = 28.782.
>>> > > >>
>>> > > >> I don't know how to sum up the above line in terms of differences
>>> > > >> succinctly.
>>> > > >
>>> > > > The aov estimate will not exactly equal the observed mean (this is
>>> > > _statistics_ after all). You should be comparing the mean of that cell
>>> > > to the estimate:
>>> > > >
>>> > > > 44.556 + (-16.33) +(-20.556) + (21.11)
>>> > >
>>> > > A respected participant advised me to look at this more closely. In
>>> > > this case (and I think in most such cases)  where there are the same
>>> > > number of parameters as there are means, the model is "saturated" and
>>> > > there is no
>>> > > difference:
>>> > >
>>> > >  with( warpbreaks, tapply( breaks, interaction(wool, tension), mean )
>>> )
>>> > >      A.L      B.L      A.M      B.M      A.H      B.H
>>> > > 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
>>> > >
>>> > > So the B:M estimate is identical up to rounding with the observed
>>> mean:
>>> > >
>>> > >  44.556 + (-16.33) +(-20.556) + (21.11)
>>> > > [1] 28.78
>>> > >
>>> > >
>>> > >
>>> > > >
>>> > > > The difference between the observed mean and the estimated mean is
>>> > known
>>> > > as a 'residual'
>>> > >
>>> > > I've also been privately but gently chided for this misstatement.
>>> > > Residuals are the difference between data and estimates.
>>> > >
>>> > > > and the sum of all the squared residuals is what is being
>>> minimized
>>> > > ... over all the cells including the one implicitly associated with
>>> the
>>> > > Intercept.
>>> > > >
>>> > > > This isn't really on-topic for Rhelp since you are not having
>>> difficulty
>>> > > in getting the R program to perform its duties, but are rather in
>>> need of
>>> > > statistical education. That is not what this mailing list is set up for.
>>> > > >
>>> > > > --
>>> > > > David.
>>> > > >
>>> > > >>
>>> > > >>>
>>> > > >>>> -----Original Message-----
>>> > > >>>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
>>> Ashim
>>> > > >>>> Kapoor
>>> > > >>>> Sent: Thursday, December 1, 2016 2:48 PM
>>> > > >>>> To: r-help at r-project.org
>>> > > >>>> Subject: [R] Interpreting summary.lm for a 2 factor anova
>>> > > >>>>
>>> > > >>>> Dear all,
>>> > > >>>>
>>> > > >>>> Here is a small example : -
>>> > > >>>>
>>> > > >>>>> model <- aov(breaks ~ wool * tension, data = warpbreaks)
>>> > > >>>>> summary.lm(model)
>>> > > >>>>
>>> > > >>>> Call:
>>> > > >>>> aov(formula = breaks ~ wool * tension, data = warpbreaks)
>>> > > >>>>
>>> > > >>>> Residuals:
>>> > > >>>>    Min       1Q   Median       3Q      Max
>>> > > >>>> -19.5556  -6.8889  -0.6667   7.1944  25.4444
>>> > > >>>>
>>> > > >>>> Coefficients:
>>> > > >>>>              Estimate Std. Error t value Pr(>|t|)
>>> > > >>>> (Intercept)      44.556      3.647  12.218 2.43e-16 ***
>>> > > >>>> woolB           -16.333      5.157  -3.167 0.002677 **
>>> > > >>>> tensionM        -20.556      5.157  -3.986 0.000228 ***
>>> > > >>>> tensionH        -20.000      5.157  -3.878 0.000320 ***
>>> > > >>>> woolB:tensionM   21.111      7.294   2.895 0.005698 **
>>> > > >>>> woolB:tensionH   10.556      7.294   1.447 0.154327
>>> > > >>>> ---
>>> > > >>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>> > > >>>>
>>> > > >>>> Residual standard error: 10.94 on 48 degrees of freedom
>>> > > >>>> Multiple R-squared:  0.3778,    Adjusted R-squared:  0.3129
>>> > > >>>> F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
>>> > > >>>>
>>> > > >>>>> model.tables(model,"e")
>>> > > >>>> Tables of effects
>>> > > >>>>
>>> > > >>>> wool
>>> > > >>>> wool
>>> > > >>>>     A       B
>>> > > >>>> 2.8889 -2.8889
>>> > > >>>>
>>> > > >>>> tension
>>> > > >>>> tension
>>> > > >>>>    L      M      H
>>> > > >>>> 8.241 -1.759 -6.481
>>> > > >>>>
>>> > > >>>> wool:tension
>>> > > >>>>   tension
>>> > > >>>> wool L      M      H
>>> > > >>>>  A  5.278 -5.278  0.000
>>> > > >>>>  B -5.278  5.278  0.000
>>> > > >>>>
>>> > > >>>>
>>> > > >>>>> model.tables(model,"m")
>>> > > >>>> Tables of means
>>> > > >>>> Grand mean
>>> > > >>>>
>>> > > >>>> 28.14815
>>> > > >>>>
>>> > > >>>> wool
>>> > > >>>> wool
>>> > > >>>>    A      B
>>> > > >>>> 31.037 25.259
>>> > > >>>>
>>> > > >>>> tension
>>> > > >>>> tension
>>> > > >>>>   L     M     H
>>> > > >>>> 36.39 26.39 21.67
>>> > > >>>>
>>> > > >>>> wool:tension
>>> > > >>>>   tension
>>> > > >>>> wool L     M     H
>>> > > >>>>  A 44.56 24.00 24.56
>>> > > >>>>  B 28.22 28.78 18.78
>>> > > >>>>>
>>> > > >>>>
>>> > > >>>> I don't follow the output of summary.lm. I understand the output
>>> of
>>> > > >>>> model.tables for effects and means. For instance what does 44.556
>>> > > >>>> represent ? Is it the grand average ? The grand mean is
>>> 28.14815. Can
>>> > > >>>> someone help me understand the output of summary.lm ?
>>> > > >>>>
>>> > > >>>> Best Regards,
>>> > > >>>> Ashim
>>> > > >>>>
>>> > > >>>>     [[alternative HTML version deleted]]
>>> > > >>>>
>>> > > >>>> ______________________________________________
>>> > > >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
>>> see
>>> > > >>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> ng-
>>> > > >>>> guide.html
>>> > > >>>> and provide commented, minimal, self-contained, reproducible
>>> code.
>>> > > >>>
>>> > > >>
>>> > > >
>>> > > > David Winsemius
>>> > > > Alameda, CA, USA
>>> > > >
>>> > >
>>> > > David Winsemius
>>> > > Alameda, CA, USA
>>> > >
>>> > >
>>> >
>>>
>>
>>
>