[R] [OT] 1 vs 2-way anova technical question

Bert Gunter gunter.berton at gene.com
Mon Nov 21 15:55:41 CET 2011


Giovanni:

1. Please read ?formula and/or An Introduction to R for how to specify
linear models in R.

2. Correct specification of what you want (if I understand correctly) is
log(R) ~ A*B + C + D

3. ... which presumably will also fail because some of your factors
have only one level, which means that you cannot use them in your
model.

4. ... which, in turn, suggests you don't know what you're doing
statistically and should seek local assistance, especially in trying
to interpret a fit to unbalanced data (you can't do it as you
probably think you can).
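
To make points 2 and 3 concrete, here is a minimal sketch. It does not use your data: the `throughput` data frame below is a made-up stand-in in which D is assumed to have a single level, and the response values are random. It shows how A*B expands in a formula, how to count levels per factor, and that the fit goes through once the one-level factor is dropped.

```r
# Hypothetical stand-in for the poster's 'throughput' data; all values are made up.
set.seed(1)
throughput <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"),
                          C = c("c1", "c2", "c3"), D = "d1",
                          stringsAsFactors = TRUE)
throughput$R <- exp(rnorm(nrow(throughput)))  # positive, so log(R) is defined

# In a formula, A*B means A + B + A:B. I(A*B) instead asks for the
# arithmetic product of two factors, which is not meaningful.
terms_used <- attr(terms(log(R) ~ A * B + C + D), "term.labels")
print(terms_used)

# Count the levels of each factor; contrasts need at least 2 levels.
n_levels <- sapply(throughput[c("A", "B", "C", "D")], nlevels)
print(n_levels)

# With the one-level factor D in the model, aov() stops with an error.
err_msg <- tryCatch(
  { aov(log(R) ~ A * B + C + D, data = throughput); NA_character_ },
  error = function(e) conditionMessage(e)
)
print(err_msg)

# Dropping D (assumed here to be the only one-level factor) lets the fit run.
fit <- aov(log(R) ~ A * B + C, data = throughput)
print(summary(fit))
```

On real data, run the level count on the data frame actually passed to aov(), since subsetting or missing values can leave a factor with fewer observed levels than expected.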

I should say in your defense that posts on this list indicate that
point 4 is a widely shared problem among posters here.
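
As a rough illustration of why point 4 matters, the sketch below fits the same two main effects in both orders. The data are made up, with deliberately unequal cell counts, and the names `d`, `y`, `ss_A_first` and `ss_A_last` are purely illustrative. With unbalanced cells, the sequential sums of squares reported by anova() depend on the order of the terms, so a "percent of variation" table built from them is not unique.

```r
# Made-up unbalanced two-factor data: cell counts are 9, 3, 2 and 4.
set.seed(42)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(12, 6))),
  B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 4))))
)
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

print(table(d$A, d$B))  # shows the unequal cell counts

# Same model, two term orders; anova() reports sequential sums of squares.
tab_AB <- anova(lm(y ~ A + B, data = d))
tab_BA <- anova(lm(y ~ B + A, data = d))

ss_A_first <- tab_AB["A", "Sum Sq"]  # A entered first
ss_A_last  <- tab_BA["A", "Sum Sq"]  # A entered after B
print(c(A_first = ss_A_first, A_after_B = ss_A_last))

# The rows differ between the two tables, but the totals agree.
print(c(total_AB = sum(tab_AB[["Sum Sq"]]), total_BA = sum(tab_BA[["Sum Sq"]])))
```

In a balanced design the two orders would give the same sum of squares for A, which is why textbook examples rarely show the problem.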

Cheers,
Bert

On Mon, Nov 21, 2011 at 5:02 AM, Giovanni Azua <bravegag at gmail.com> wrote:
> Hello,
>
> Couple of clarifications:
> - A,B,C,D are factors and I am also interested in possible interactions but the model that comes out from aov R~A*B*C*D violates the model assumptions
> - My 2^k is unbalanced i.e. missing data and an additional level I also include in one of the factors i.e. C
> - I was referring in the OP to the 4-way interactions and not 2-way, I'm sorry for my confusion.
> - I tried to create an aov model with fewer interactions this way, but I get the following error:
>
> model.aov <- aov(log(R)~A+B+I(A*B)+C+D,data=throughput)
> Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
>  contrasts can be applied only to factors with 2 or more levels
> In addition: Warning message:
> In Ops.factor(A, B) : * not meaningful for factors
>
> Here I was trying to say: do a one-way anova except for the A and B factors for which I would like to get their 2-way interactions ...
>
> Thanks in advance,
> Best regards,
> Giovanni
>
> On Nov 21, 2011, at 12:04 PM, Giovanni Azua wrote:
>
>> Hello,
>>
>> I know there are plenty of people in this group who can give me a good answer :)
>>
>> I have a 2^k model where k=4 like this:
>> Model 1) R~A*B*C*D
>>
>> If I use "*" among all terms in R, I take it to mean exploring all interactions and including them in the model, i.e. I think this would be the so-called 2-way anova. However, if I do this, it leads to model violations, i.e. homoscedasticity is violated, the normality assumption for the errors (i.e. the residuals) is violated, etc. I tried correcting the issues using different standard transformations (log, sqrt, Box-Cox forms, etc.) but none really improves the result. In this case, even though the model assumptions do not hold, some of the interactions are found to significantly influence the response variable. But then should I trust the results of this Model 1) given that the assumptions do not hold?
>>
>> Then I try this other model where I exclude the interactions (is this the 1-way anova?):
>> Model 2) R~A+B+C+D
>>
>> In this one the model assumptions hold except the existence of some outliers and a slightly heavy tail in the QQ-plot.
>>
>> Given that the assumptions for Model 1) do not hold, I assume I should ignore the results of Model 1) altogether? Or can I instead safely use the Sum Sq of Model 1) to get my table of percent of variation?
>>
>> This was a bit counter-intuitive to me, since I assumed that if there was collinearity among factors (and there is, e.g. I(A*B*C)) and I included those interactions as in Model 1), my model would be more accurate ... OK, this has turned into a brand new topic of model selection, but I am mostly interested in the question: if the model assumptions are violated, can I, or must I not, use the results, e.g. the Sum Sq, for that model?
>>
>> Can anyone advise, please?
>>
>> btw I have bought most books on R and statistical analysis. I have researched them all and the ANOVA coverage is very shallow in most of them, especially in the R-sy ones; they just offer a slightly pimped-up version of the R help.
>>
>> I am also unofficially following a course on ANOVA from the university I am registered in and most examples are too simplistic and either the assumptions just hold easily or the assumptions don't hold and nothing happens.
>>
>> Thanks in advance,
>> Best regards,
>> Giovanni
>>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


