Amasco,
In general it is dangerous to attempt to interpret a main effect that
is included in an interaction, regardless of whether or not the
interaction is significant. If you want to make a valid inference about
a main effect, it is safest to do so after dropping any interaction that
contains the main effect. Since you would not want to drop a significant
interaction, you should not try to interpret a main effect in the
presence of a significant interaction that contains the main effect. If
the interaction is not significant, drop the interaction, re-run the
model, and then look at the main effect.
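As a rough sketch of that workflow in R (the data set, the 0.05
threshold, and the object names are illustrative assumptions, not part
of the advice itself):

```r
# Illustrative data only: the built-in warpbreaks data with a few rows
# removed so that the design is unbalanced.
dat <- warpbreaks[-c(1, 2, 3, 20, 21), ]

# Step 1: fit the model with the interaction and test the interaction.
# The interaction is the last term, so its sequential test is the usual one.
full <- lm(breaks ~ wool * tension, data = dat)
int_test <- anova(full)["wool:tension", ]
print(int_test)

# Step 2: only if the interaction is not significant, drop it, re-fit,
# and then look at the main effects.
alpha <- 0.05  # conventional threshold, an assumption of this sketch
if (int_test[["Pr(>F)"]] > alpha) {
  additive <- update(full, . ~ . - wool:tension)
  # Each main effect is tested after adjusting for the other one.
  print(drop1(additive, test = "F"))
} else {
  message("Interaction is significant: do not interpret the main effects alone.")
}
```

The sketch uses only base R; with real data, replace dat and the model
formula with your own.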
John
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin@grecc.umaryland.edu
>>> "Amasco Miralisus" 8/28/2006 3:20 PM >>>
Hello,
First of all, I would like to thank everybody who answered my
question. Every post has added something to my knowledge of the topic.
I now know why Type III SS are so questionable.
As I understood from the R FAQ, there is disagreement among
statisticians about which SS to use
(http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f).
However, most commercial statistical packages use Type III as the
default (with orthogonal contrasts), as does STATISTICA, from which I
am currently trying to migrate to R. This was probably done for the
convenience of end-users who are not very experienced in theoretical
statistics.
I am aware that the same result could be produced using the standard
anova() function with Type I "sequential" SS, supplemented by the
drop1() function, but this approach will look quite complicated to
people without any substantial background in statistics, such as
non-math students. I would prefer an easier way, possibly more
universal, though probably also more "for dummies" :) If I am not
mistaken, the car package by John Fox, with its nice Anova() function,
is a reasonable alternative for anyone who wishes simply to perform a
quick statistical analysis without fear of messing something up in the
model fitting. Of course, orthogonal contrasts have to be specified
(for example contr.sum) in the case of Type III SS.
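For comparison, here is a minimal sketch of the base-R route mentioned
above, anova() plus drop1(). The data set and object names are
illustrative assumptions, and the car call is shown only as an optional
cross-check if that package happens to be installed:

```r
# Illustrative unbalanced data: built-in warpbreaks minus a few rows.
dat <- warpbreaks[-c(1, 2, 3, 20, 21), ]

# Sum-to-zero contrasts for both factors.
ctr <- list(wool = contr.sum, tension = contr.sum)
fit <- lm(breaks ~ wool * tension, data = dat, contrasts = ctr)

# Type I (sequential) SS: in an unbalanced design the main-effect
# lines depend on the order in which the factors enter the formula.
seq_ab <- anova(fit)
seq_ba <- anova(lm(breaks ~ tension * wool, data = dat, contrasts = ctr))
print(seq_ab)
print(seq_ba)

# "Type III"-style tests in base R: drop each term in turn from the
# full model, ignoring marginality (scope = . ~ . means "all terms").
t3 <- drop1(fit, . ~ ., test = "F")
print(t3)

# Optional cross-check, assuming the car package is available.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit, type = "III"))
}
```

The interaction line is the same in every table, because the
interaction is always the last term; only the main-effect lines
change.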
Therefore, I would like to reformulate my questions, to make it easier
for you to answer:
1. The first question relates to the answer by Professor Brian Ripley:
Did I understand correctly from the advised paper (Bill Venables'
'exegeses' paper) that there is not much sense in testing main effects
if the interaction is significant?
2. If I understood the post by John Fox correctly, I could safely use
the Anova(., type="III") function from car for ANOVA analyses in R,
both for balanced and unbalanced designs? Provided, of course, that the
model was fitted with orthogonal contrasts. Something like below:
library(car)  # provides Anova()
mod <- aov(response ~ factor1 * factor2, data = mydata,
           contrasts = list(factor1 = contr.sum,
                            factor2 = contr.sum))
Anova(mod, type = "III")
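To illustrate why the choice of contrasts matters here, the following
sketch fits the same model twice with different codings and compares
the term-by-term drop tests. It uses base R only; the data set and
object names are assumptions made for the example:

```r
# Illustrative unbalanced data: built-in warpbreaks minus a few rows.
dat <- warpbreaks[-c(1, 2, 3, 20, 21), ]

fit_sum <- lm(breaks ~ wool * tension, data = dat,
              contrasts = list(wool = contr.sum, tension = contr.sum))
fit_trt <- lm(breaks ~ wool * tension, data = dat,
              contrasts = list(wool = contr.treatment,
                               tension = contr.treatment))

# Both codings describe the same fitted model ...
print(all.equal(fitted(fit_sum), fitted(fit_trt)))

# ... but dropping a main effect while keeping the interaction tests a
# different hypothesis under each coding, so the main-effect lines differ.
t3_sum <- drop1(fit_sum, . ~ ., test = "F")
t3_trt <- drop1(fit_trt, . ~ ., test = "F")
print(t3_sum)
print(t3_trt)
```

With contr.treatment the "wool" line tests the wool difference at the
baseline level of tension only, which is not what is usually meant by
a main effect. The interaction line is unaffected by the coding.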
It was also said in most of your posts that the decision about which
type of SS to use has to be made on the basis of the hypothesis we
want to test. Therefore, let's assume that I would like to test the
significance of both factors, and if either of them is significant, I
plan to use post-hoc tests to explore the difference(s) between the
levels of the significant factor(s).
Thank you in advance, Amasco
On 8/27/06, John Fox wrote:
> Dear Amasco,
>
> A complete explanation of the issues that you raise is awkward in an email,
> so I'll address your questions briefly. Section 8.2 of my text, Applied
> Regression Analysis, Linear Models, and Related Methods (Sage, 1997) has a
> detailed discussion.
>
> (1) In balanced designs, so-called "Type I," "II," and "III" sums of squares
> are identical. If the STATA manual says that Type II tests are only
> appropriate in balanced designs, then that doesn't make a whole lot of sense
> (unless one believes that Type-II tests are nonsense, which is not the
> case).
>
> (2) One should concentrate not directly on different "types" of sums of
> squares, but on the hypotheses to be tested. Sums of squares and F-tests
> should follow from the hypotheses. Type-II and Type-III tests (if the latter
> are properly formulated) test hypotheses that are reasonably construed as
> tests of main effects and interactions in unbalanced designs. In unbalanced
> designs, Type-I sums of squares usually test hypotheses of interest only by
> accident.
>
> (3) Type-II sums of squares are constructed obeying the principle of
> marginality, so the kinds of contrasts employed to represent factors are
> irrelevant to the sums of squares produced. You get the same answer for any
> full set of contrasts for each factor. In general, the hypotheses tested
> assume that terms to which a particular term is marginal are zero. So, for
> example, in a three-way ANOVA with factors A, B, and C, the Type-II test for
> the AB interaction assumes that the ABC interaction is absent, and the test
> for the A main effect assumes that the ABC, AB, and AC interactions are
> absent (but not necessarily the BC interaction, since the A main effect is
> not marginal to this term). A general justification is that we're usually
> not interested, e.g., in a main effect that's marginal to a nonzero
> interaction.
>
> (4) Type-III tests do not assume that terms higher-order to the term in
> question are zero. For example, in a two-way design with factors A and B,
> the type-III test for the A main effect tests whether the population
> marginal means at the levels of A (i.e., averaged across the levels of B)
> are the same. One can test this hypothesis whether or not A and B interact,
> since the marginal means can be formed whether or not the profiles of means
> for A within levels of B are parallel. Whether the hypothesis is of interest
> in the presence of interaction is another matter, however. To compute
> Type-III tests using incremental F-tests, one needs contrasts that are
> orthogonal in the row-basis of the model matrix. In R, this means, e.g.,
> using contr.sum, contr.helmert, or contr.poly (all of which will give you
> the same SS), but not contr.treatment. Failing to be careful here will
> result in testing hypotheses that are not reasonably construed, e.g., as
> hypotheses concerning main effects.
>
> (5) The same considerations apply to linear models that include quantitative
> predictors -- e.g., ANCOVA. Most software will not automatically produce
> sensible Type-III tests, however.
>
> I hope this helps,
> John
>
> --------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> http://socserv.mcmaster.ca/jfox
> --------------------------------
>
> > -----Original Message-----
> > From: r-help-bounces@stat.math.ethz.ch
> > [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Amasco
> > Miralisus
> > Sent: Saturday, August 26, 2006 5:07 PM
> > To: r-help@stat.math.ethz.ch
> > Subject: [R] Type II and III sum of square in Anova (R, car package)
> >
> > Hello everybody,
> >
> > I have some questions on ANOVA in general and on ANOVA in R
> > particularly.
> > I am not Statistician, therefore I would be very appreciated
> > if you answer it in a simple way.
> >
> > 1. First of all, more general question. Standard anova()
> > function for lm() or aov() models in R implements Type I sum
> > of squares (sequential), which is not well suited for
> > unbalanced ANOVA. Therefore it is better to use
> > Anova() function from car package, which was programmed by
> > John Fox to use Type II and Type III sum of squares. Did I
> > get the point?
> >
> > 2. Now more specific question. Type II sum of squares is not
> > well suited for unbalanced ANOVA designs too (as stated in
> > STATISTICA help), therefore the general rule of thumb is to
> > use Anova() function using Type II SS only for balanced ANOVA
> > and Anova() function using Type III SS for unbalanced ANOVA?
> > Is this correct interpretation?
> >
> > 3. I have found a post from John Fox in which he wrote that
> > Type III SS could be misleading in case someone use some
> > contrasts. What is this about?
> > Could you please advice, when it is appropriate to use Type
> > II and when Type III SS? I do not use contrasts for
> > comparisons, just general ANOVA with subsequent Tukey
> > post-hoc comparisons.
> >
> > Thank you in advance,
> > Amasco
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>