[R] Interpreting summary.lm for a 2 factor anova
Ashim Kapoor
ashimkapoor at gmail.com
Sun Dec 4 05:33:07 CET 2016
Dear Sir,
Many thanks for the explanation. Prior to your email (with some help from a
friend of mine) I was able to figure this one out. If we look at the model:
y = intercept + B1.woolB + B2.tensionM + B3.tensionH + B4.woolB.tensionM
+ B5.woolB.tensionH + error
Here woolB, tensionM, tensionH are the dummy indicator variables similar to
how you have defined them.
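
The indicator columns can be inspected directly. A minimal sketch, assuming the built-in warpbreaks data and R's default treatment contrasts (contr.treatment); the names X and cells are illustrative only:

```r
# Design matrix R builds for breaks ~ wool * tension under the default
# treatment contrasts (an assumption; other contrasts give other columns).
X <- model.matrix(~ wool * tension, data = warpbreaks)

# Keep one distinct row per wool/tension cell: 6 cells, 6 columns.
cells <- unique(X)

# Label the rows by cell, in order of first appearance in the data.
rownames(cells) <- with(warpbreaks, unique(paste(wool, tension, sep = ".")))
cells
```

Each row shows which coefficients are switched on for that cell; the A.L row has only the intercept.
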
Now suppose we consider y1,..,yn, all in group A.L (say). Every dummy
variable is 0 in that cell, so each y_i = intercept + error_i. The residuals
within a cell sum to zero, so average(y1,...,yn) = intercept + 0 + 0 + 0 + 0
+ 0.
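What was confusing me was how to compute the cell mean in the woolB, tensionH
cell.
If we have y_1,...,y_n all in group B.H then woolB, tensionH and their
product are 1 and the tensionM terms are 0. Since the residuals in the cell
sum to zero:
y_1 + ... + y_n = n.(intercept + B1 + 0 + B3 + 0 + B5)
Therefore average of group B.H = intercept + B1 + B3 + B5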
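
The identity above can be checked numerically. A minimal sketch, assuming the built-in warpbreaks data and R's default treatment contrasts; the names fit, b, bh_from_coef and bh_observed are illustrative only:

```r
# Fit the same two-factor model with interaction used in this thread.
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
b <- coef(fit)

# Cell mean for wool B, tension H rebuilt from the coefficients:
# intercept + B1 (woolB) + B3 (tensionH) + B5 (woolB:tensionH).
bh_from_coef <- unname(b["(Intercept)"] + b["woolB"] +
                       b["tensionH"] + b["woolB:tensionH"])

# Observed mean of the same cell, computed directly from the data.
bh_observed <- with(warpbreaks, mean(breaks[wool == "B" & tension == "H"]))

# The model has as many coefficients as cells, so the two should agree
# up to floating-point error.
all.equal(bh_from_coef, bh_observed)
```

The same sum with tensionM and woolB:tensionM in place of the tensionH terms gives the B.M cell.
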
Many thanks and Best Regards,
Ashim
On Sat, Dec 3, 2016 at 7:15 PM, Fox, John <jfox at mcmaster.ca> wrote:
> Dear Ashim,
>
> Sorry to chime in late, and my apologies if someone has already pointed
> this out, but here's the relationship between the cell means and the model
> coefficients, using the row-basis of the model matrix:
>
> -------------------------- snip ------------------------
>
> > means <- with(warpbreaks, tapply(breaks, interaction(wool, tension), mean))
> > x.A <- rep(c(0, 1), 3)
> > x.B1 <- rep(c(0, 1, 0), each=2)
> > x.B2 <- rep(c(0, 0, 1), each=2)
> > x.AB1 <- x.A*x.B1
> > x.AB2 <- x.A*x.B2
> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
> > X.basis
>        x.A x.B1 x.B2 x.AB1 x.AB2
> [1,] 1   0    0    0     0     0
> [2,] 1   1    0    0     0     0
> [3,] 1   0    1    0     0     0
> [4,] 1   1    1    0     1     0
> [5,] 1   0    0    1     0     0
> [6,] 1   1    0    1     0     1
> > solve(X.basis, means)
>                 x.A      x.B1      x.B2     x.AB1     x.AB2
>  44.55556 -16.33333 -20.55556 -20.00000  21.11111  10.55556
> > coef(aov(breaks ~ wool * tension, data = warpbreaks))
>    (Intercept)          woolB       tensionM       tensionH woolB:tensionM
>       44.55556      -16.33333      -20.55556      -20.00000       21.11111
> woolB:tensionH
>       10.55556
>
> -------------------------- snip ------------------------
>
> I hope this helps,
> John
>
> -----------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>
> > -----Original Message-----
> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim
> Kapoor
> > Sent: December 3, 2016 12:19 AM
> > To: David Winsemius <dwinsemius at comcast.net>
> > Cc: r-help at r-project.org
> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
> >
> > Please allow me to rephrase my query.
> >
> > > model.tables(model,"m")
> > Tables of means
> > Grand mean
> >
> > 28.14815
> >
> > wool
> > wool
> > A B
> > 31.037 25.259
> >
> > tension
> > tension
> > L M H
> > 36.39 26.39 21.67
> >
> > wool:tension
> > tension
> > wool L M H
> > A 44.56 24.00 24.56
> > B 28.22 28.78 18.78
> > >
> >
> >
> > The above is the same as :
> >
> > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
> > A.L B.L A.M B.M A.H B.H
> > 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
> >
> > For reference:
> >
> > > model <- aov(breaks ~ wool * tension, data = warpbreaks)
> > > summary.lm(model)
> >
> > Call:
> > aov(formula = breaks ~ wool * tension, data = warpbreaks)
> >
> > Residuals:
> > Min 1Q Median 3Q Max
> > -19.5556 -6.8889 -0.6667 7.1944 25.4444
> >
> > Coefficients:
> >                Estimate Std. Error t value Pr(>|t|)
> > (Intercept)      44.556      3.647  12.218 2.43e-16 ***
> > woolB           -16.333      5.157  -3.167 0.002677 **
> > tensionM        -20.556      5.157  -3.986 0.000228 ***
> > tensionH        -20.000      5.157  -3.878 0.000320 ***
> > woolB:tensionM   21.111      7.294   2.895 0.005698 **
> > woolB:tensionH   10.556      7.294   1.447 0.154327
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Residual standard error: 10.94 on 48 degrees of freedom
> > Multiple R-squared: 0.3778, Adjusted R-squared: 0.3129
> > F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772
> >
> >
> > Now I'll explain what is confusing me in the output of summary.lm.
> >
> > Coeff of Intercept = 44.556 = cell mean for A.L. This is the base.
> >
> > Coeff of woolB:L = -16.333 = 28.22222 - 44.556. This is the difference
> > of this cell mean (B:L) from the base.
> >
> > Coeff of woolA:tensionM = -20.556 = 24.000 - 44.556. This is the
> > difference of this cell mean (A:M) from the base.
> >
> > Coeff of woolA:tensionH = -20.000 = 24.55556 - 44.556. This is the
> > difference of this cell mean (A:H) from the base.
> >
> > This is where it stops being the difference from the base.
> >
> > Coeff of woolB:tensionM = 21.111 should turn out to be 28.77778 - 44.556,
> > but that is -15.77822.
> >
> > Coeff of woolB:tensionH = 10.556 should turn out to be 18.77778 - 44.556,
> > but that is -25.77822.
> >
> > In the above 2 cases, we can't say that the coefficient = cell mean -
> > base case. Can you tell me what statement should be made?
> >
> >
> > Best Regards,
> > Ashim
> >
> > PS : My apologies for emailing my query to this list. Can you tell me the
> > names of a few (active) statistics help lists?
> >
> > On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius <dwinsemius at comcast.net>
> > wrote:
> >
> > >
> > > > On Dec 2, 2016, at 9:09 AM, David Winsemius <dwinsemius at comcast.net>
> > > wrote:
> > > >
> > > >>
> > > >> On Dec 2, 2016, at 6:16 AM, Ashim Kapoor <ashimkapoor at gmail.com>
> > wrote:
> > > >>
> > > >> Dear Pikal,
> > > >>
> > > >> All levels except the interactions are compared to the Intercept.
> > > >> I'm a little confused as to what's going on in interaction terms
> > > >> eg. the cell wool B : tension M. Its mean is :
> > > >> 28.78 and 28.78 - 44.56 = -15.78 != 21.111.
> > > >>
> > > >> It's something like 44.56 (intercept) - 16.333 (wool B) - 20.556
> > > >> (tension M) + 21.111 (woolB:tensionM) = 28.782.
> > > >>
> > > >> I don't know how to sum up the above line in terms of differences
> > > >> succinctly.
> > > >
> > > > The aov estimate will not exactly equal the observed mean (this is
> > > _statistics_ after all). You should be comparing the mean of that cell
> > > to the estimate:
> > > >
> > > > 44.556 + (-16.33) +(-20.556) + (21.11)
> > >
> > > A respected participant advised me to look at this more closely. In
> > > this case (and I think in most such cases) where there are the same
> > > number of parameters as there are means, the model is "saturated" and
> > > there is no
> > > difference:
> > >
> > > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
> > > A.L B.L A.M B.M A.H B.H
> > > 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
> > >
> > > So the B:M estimate is identical up to rounding with the observed mean:
> > >
> > > 44.556 + (-16.33) +(-20.556) + (21.11)
> > > [1] 28.78
> > >
> > >
> > >
> > > >
> > > > The difference between the observed mean and the estimated mean is
> > known
> > > as a 'residual'
> > >
> > > I've also been privately but gently chided for this misstatement.
> > > Residuals are the difference between data and estimates.
> > >
> > > > and the sum of all the squared residuals is what is being minimized
> > > ... over all the cells including the one implicitly associated with the
> > > Intercept.
> > > >
> > > > This isn't really on-topic for Rhelp since you are not having
> difficulty
> > > in getting the R program to perform its duties, but are rather in need
> of
> > > statistical education. That is not what this mailing list is set up for.
> > > >
> > > > --
> > > > David.
> > > >
> > > >>
> > > >>>
> > > >>>> -----Original Message-----
> > > >>>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
> Ashim
> > > >>>> Kapoor
> > > >>>> Sent: Thursday, December 1, 2016 2:48 PM
> > > >>>> To: r-help at r-project.org
> > > >>>> Subject: [R] Interpreting summary.lm for a 2 factor anova
> > > >>>>
> > > >>>> Dear all,
> > > >>>>
> > > >>>> Here is a small example : -
> > > >>>>
> > > >>>>> model <- aov(breaks ~ wool * tension, data = warpbreaks)
> > > >>>>> summary.lm(model)
> > > >>>>
> > > >>>> Call:
> > > >>>> aov(formula = breaks ~ wool * tension, data = warpbreaks)
> > > >>>>
> > > >>>> Residuals:
> > > >>>> Min 1Q Median 3Q Max
> > > >>>> -19.5556 -6.8889 -0.6667 7.1944 25.4444
> > > >>>>
> > > >>>> Coefficients:
> > > > >>>>                Estimate Std. Error t value Pr(>|t|)
> > > > >>>> (Intercept)      44.556      3.647  12.218 2.43e-16 ***
> > > > >>>> woolB           -16.333      5.157  -3.167 0.002677 **
> > > > >>>> tensionM        -20.556      5.157  -3.986 0.000228 ***
> > > > >>>> tensionH        -20.000      5.157  -3.878 0.000320 ***
> > > > >>>> woolB:tensionM   21.111      7.294   2.895 0.005698 **
> > > > >>>> woolB:tensionH   10.556      7.294   1.447 0.154327
> > > >>>> ---
> > > >>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > > >>>>
> > > >>>> Residual standard error: 10.94 on 48 degrees of freedom
> > > >>>> Multiple R-squared: 0.3778, Adjusted R-squared: 0.3129
> > > >>>> F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772
> > > >>>>
> > > >>>>> model.tables(model,"e")
> > > >>>> Tables of effects
> > > >>>>
> > > >>>> wool
> > > >>>> wool
> > > >>>> A B
> > > >>>> 2.8889 -2.8889
> > > >>>>
> > > >>>> tension
> > > >>>> tension
> > > >>>> L M H
> > > >>>> 8.241 -1.759 -6.481
> > > >>>>
> > > >>>> wool:tension
> > > >>>> tension
> > > >>>> wool L M H
> > > >>>> A 5.278 -5.278 0.000
> > > >>>> B -5.278 5.278 0.000
> > > >>>>
> > > >>>>
> > > >>>>> model.tables(model,"m")
> > > >>>> Tables of means
> > > >>>> Grand mean
> > > >>>>
> > > >>>> 28.14815
> > > >>>>
> > > >>>> wool
> > > >>>> wool
> > > >>>> A B
> > > >>>> 31.037 25.259
> > > >>>>
> > > >>>> tension
> > > >>>> tension
> > > >>>> L M H
> > > >>>> 36.39 26.39 21.67
> > > >>>>
> > > >>>> wool:tension
> > > >>>> tension
> > > >>>> wool L M H
> > > >>>> A 44.56 24.00 24.56
> > > >>>> B 28.22 28.78 18.78
> > > >>>>>
> > > >>>>
> > > >>>> I don't follow the output of summary.lm. I understand the output
> of
> > > >>>> model.tables for effects and means. For instance what does 44.556
> > > >>>> represent ? Is it the grand average ? The grand mean is 28.14815.
> Can
> > > >>>> someone help me understand the output of summary.lm ?
> > > >>>>
> > > >>>> Best Regards,
> > > >>>> Ashim
> > > >>>>
> > > >>>>
> > > >>>> ______________________________________________
> > > >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> > > > >>>> PLEASE do read the posting guide
> > > > >>>> http://www.R-project.org/posting-guide.html
> > > >>>> and provide commented, minimal, self-contained, reproducible code.
> > > >>>
> > > >>
> > > >
> > > > David Winsemius
> > > > Alameda, CA, USA
> > > >
> > >
> > > David Winsemius
> > > Alameda, CA, USA
> > >
> > >
> >
>