[R] Interpreting summary.lm for a 2 factor anova

Ashim Kapoor ashimkapoor at gmail.com
Sat Dec 3 06:18:39 CET 2016


Please allow me to rephrase my query.

> model.tables(model,"m")
Tables of means
Grand mean

28.14815

 wool
wool
     A      B
31.037 25.259

 tension
tension
    L     M     H
36.39 26.39 21.67

 wool:tension
    tension
wool L     M     H
   A 44.56 24.00 24.56
   B 28.22 28.78 18.78
>


The wool:tension table above is the same as:

with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
     A.L      B.L      A.M      B.M      A.H      B.H
44.55556 28.22222 24.00000 28.77778 24.55556 18.77778

For reference:

> model <- aov(breaks ~ wool * tension, data = warpbreaks)
> summary.lm(model)

Call:
aov(formula = breaks ~ wool * tension, data = warpbreaks)

Residuals:
     Min       1Q   Median       3Q      Max
-19.5556  -6.8889  -0.6667   7.1944  25.4444

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)      44.556      3.647  12.218 2.43e-16 ***
woolB           -16.333      5.157  -3.167 0.002677 **
tensionM        -20.556      5.157  -3.986 0.000228 ***
tensionH        -20.000      5.157  -3.878 0.000320 ***
woolB:tensionM   21.111      7.294   2.895 0.005698 **
woolB:tensionH   10.556      7.294   1.447 0.154327
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.94 on 48 degrees of freedom
Multiple R-squared:  0.3778,    Adjusted R-squared:  0.3129
F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
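
As a rough sketch of how the six coefficients above map onto the six cell means (assuming R's default treatment contrasts; `grid`, `X`, `fitted_cells` and `observed` are names made up here for illustration), one can build the dummy coding for each cell and multiply it by the coefficients:

```r
# Fit the model from the post (warpbreaks ships with R's datasets package).
model <- aov(breaks ~ wool * tension, data = warpbreaks)

# One row per wool:tension cell; wool varies fastest, matching the tapply() order.
grid <- expand.grid(wool = levels(warpbreaks$wool),
                    tension = levels(warpbreaks$tension))

# Dummy coding under the default treatment contrasts (an assumption of this sketch).
X <- model.matrix(~ wool * tension, data = grid)

# Each fitted cell mean is the sum of the coefficients whose column is 1 in that row.
fitted_cells <- drop(X %*% coef(model))
names(fitted_cells) <- with(grid, paste(wool, tension, sep = "."))

# Observed cell means, for comparison.
observed <- with(warpbreaks, tapply(breaks, interaction(wool, tension), mean))

print(X)
print(round(fitted_cells, 3))
print(all.equal(unname(fitted_cells), as.vector(observed)))
```

In the printed design matrix, the row for cell B.M has a 1 in the (Intercept), woolB, tensionM and woolB:tensionM columns, so its fitted mean is the sum of those four coefficients rather than a single difference from the base.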


Now I'll explain what is confusing me in the output of summary.lm.

Coeff of Intercept = 44.556 = cell mean for A.L. This is the base.

Coeff of woolB = -16.333 = 28.22222 - 44.556. This is the difference of
the cell mean (B:L) from the base.

Coeff of tensionM = -20.556 = 24.000 - 44.556. This is the difference
of the cell mean (A:M) from the base.

Coeff of tensionH = -20.000 = 24.55556 - 44.556. This is the
difference of the cell mean (A:H) from the base.

This is where it stops being the difference from the base.

Coeff of woolB:tensionM = 21.111. I expected 28.77778 - 44.556,
but that is -15.77822.

Coeff of woolB:tensionH = 10.556. I expected 18.77778 - 44.556,
but that is -25.77822.

In these two cases we can't say that the coefficient = cell mean - base
case. Can you tell me what statement should be made instead?
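
One possible reading, offered as a sketch rather than a definitive answer and assuming the default treatment contrasts: each interaction coefficient looks like a difference of differences, i.e. how much the A-to-B wool effect at tension M (or H) differs from the A-to-B wool effect at the base tension L. The names `m`, `wool_effect` and `dd` below are introduced only for this illustration:

```r
model <- aov(breaks ~ wool * tension, data = warpbreaks)

# Cell means as a wool x tension matrix (rows A, B; columns L, M, H).
m <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))

# Effect of switching wool A -> B within each tension level.
wool_effect <- m["B", ] - m["A", ]

# How that wool effect at M and at H differs from the wool effect at L.
dd <- wool_effect[c("M", "H")] - wool_effect["L"]

print(round(dd, 3))
print(round(coef(model)[c("woolB:tensionM", "woolB:tensionH")], 3))
```

For tension M this is (28.778 - 24.000) - (28.222 - 44.556) = 4.778 + 16.333 = 21.111, which agrees with the woolB:tensionM row of the summary.lm output.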


Best Regards,
Ashim

PS: My apologies for emailing my query to this list. Can you tell me the
names of a few (active) statistics help lists?

On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius <dwinsemius at comcast.net>
wrote:

>
> > On Dec 2, 2016, at 9:09 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >
> >>
> >> On Dec 2, 2016, at 6:16 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
> >>
> >> Dear Pikal,
> >>
> >> All levels except the interactions are compared to the Intercept. I'm a
> >> little confused as to what's going on in the interaction terms, e.g. the cell
> >> wool B : tension M. Its mean is
> >> 28.78, and 28.78 - 44.56 = -15.78 != 21.111.
> >>
> >> It's something like 44.56 (intercept) - 16.333 (wool B) - 20.556 (tension
> >> M) + 21.111 (woolB:tensionM) = 28.782.
> >>
> >> I don't know how to sum up the above line in terms of differences
> >> succinctly.
> >
> > The aov estimate will not exactly equal the observed mean (this is
> > _statistics_ after all). You should be comparing the mean of that cell to
> > the estimate:
> >
> > 44.556 + (-16.33) +(-20.556) + (21.11)
>
> A respected participant advised me to look at this more closely. In this
> case (and I think in most such cases)  where there are the same number of
> parameters as there are means, the model is "saturated" and there is no
> difference:
>
>  with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
>      A.L      B.L      A.M      B.M      A.H      B.H
> 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
>
> So the B:M estimate is identical up to rounding with the observed mean:
>
>  44.556 + (-16.33) +(-20.556) + (21.11)
> [1] 28.78
>
>
>
> >
> > The difference between the observed mean and the estimated mean is known
> > as a 'residual'
>
> I've also been privately but gently chided for this misstatement.
> Residuals are the difference between data and estimates.
>
> > and the sum of the squared residuals is what is being minimized
> > ... over all the cells including the one implicitly associated with the
> > Intercept.
> >
> > This isn't really on-topic for Rhelp since you are not having difficulty
> > in getting the R program to perform its duties, but are rather in need of
> > statistical education. That is not what this mailing list is set up for.
> >
> > --
> > David.
> >
> >>
> >>>
> >>>> -----Original Message-----
> >>>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim
> >>>> Kapoor
> >>>> Sent: Thursday, December 1, 2016 2:48 PM
> >>>> To: r-help at r-project.org
> >>>> Subject: [R] Interpreting summary.lm for a 2 factor anova
> >>>>
> >>>> Dear all,
> >>>>
> >>>> Here is a small example : -
> >>>>
> >>>>> model <- aov(breaks ~ wool * tension, data = warpbreaks)
> >>>>> summary.lm(model)
> >>>>
> >>>> Call:
> >>>> aov(formula = breaks ~ wool * tension, data = warpbreaks)
> >>>>
> >>>> Residuals:
> >>>>    Min       1Q   Median       3Q      Max
> >>>> -19.5556  -6.8889  -0.6667   7.1944  25.4444
> >>>>
> >>>> Coefficients:
> >>>>              Estimate Std. Error t value Pr(>|t|)
> >>>> (Intercept)      44.556      3.647  12.218 2.43e-16 ***
> >>>> woolB           -16.333      5.157  -3.167 0.002677 **
> >>>> tensionM        -20.556      5.157  -3.986 0.000228 ***
> >>>> tensionH        -20.000      5.157  -3.878 0.000320 ***
> >>>> woolB:tensionM   21.111      7.294   2.895 0.005698 **
> >>>> woolB:tensionH   10.556      7.294   1.447 0.154327
> >>>> ---
> >>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>>>
> >>>> Residual standard error: 10.94 on 48 degrees of freedom
> >>>> Multiple R-squared:  0.3778,    Adjusted R-squared:  0.3129
> >>>> F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
> >>>>
> >>>>> model.tables(model,"e")
> >>>> Tables of effects
> >>>>
> >>>> wool
> >>>> wool
> >>>>     A       B
> >>>> 2.8889 -2.8889
> >>>>
> >>>> tension
> >>>> tension
> >>>>    L      M      H
> >>>> 8.241 -1.759 -6.481
> >>>>
> >>>> wool:tension
> >>>>   tension
> >>>> wool L      M      H
> >>>>  A  5.278 -5.278  0.000
> >>>>  B -5.278  5.278  0.000
> >>>>
> >>>>
> >>>>> model.tables(model,"m")
> >>>> Tables of means
> >>>> Grand mean
> >>>>
> >>>> 28.14815
> >>>>
> >>>> wool
> >>>> wool
> >>>>    A      B
> >>>> 31.037 25.259
> >>>>
> >>>> tension
> >>>> tension
> >>>>   L     M     H
> >>>> 36.39 26.39 21.67
> >>>>
> >>>> wool:tension
> >>>>   tension
> >>>> wool L     M     H
> >>>>  A 44.56 24.00 24.56
> >>>>  B 28.22 28.78 18.78
> >>>>>
> >>>>
> >>>> I don't follow the output of summary.lm. I understand the output of
> >>>> model.tables for effects and means. For instance what does 44.556
> >>>> represent ? Is it the grand average ? The grand mean is 28.14815. Can
> >>>> someone help me understand the output of summary.lm ?
> >>>>
> >>>> Best Regards,
> >>>> Ashim
> >>>>
> >>>> ______________________________________________
> >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >
> > David Winsemius
> > Alameda, CA, USA
>
> David Winsemius
> Alameda, CA, USA
>
>
