[R] Two-way Unbalanced multiple sample ANOVA
Gregory Hughes
ghug2646 at postoffice.uri.edu
Wed Mar 7 17:28:47 CET 2007
Hello all,
I was wondering if anyone could help me formulate a Two-way ANOVA for
unbalanced multiple sample data?
We have a new study method aimed at helping students study for tests
using computers. (I am a computer scientist, hence my
soon-to-be-apparent lack of statistical knowledge.)
To test this study method we devised a user study in which 30 participants
attended 2 lectures, lecture1 and lecture2. Two tests were created, test1
and test2.
test1 corresponds to the material in lecture1 and test2 corresponds to
the material in lecture2.
The 30 participants were split into two groups, group1 and group2.
group1 used our new study method to review for lecture1 and their
existing study method to review the material from lecture2.
group2 used our new study method to review for lecture2 and their
existing study method to review the material from lecture1.
Each group then took the two tests.
This is a repeated measure experiment because we have 2 exam scores for
each participant, one using our new method to study and one not using
our new method to study.
The data is unbalanced because participants did not take the same test
twice.
From what I understand, balanced data would look like this:
ID TEST SYSTEM SCORE
1 1 1 80
1 1 0 70
1 2 1 90
1 2 0 95
2 1 1 70
2 1 0 75
2 2 1 80
2 2 0 75
But instead our data look like this:
ID TEST SYSTEM SCORE
1 1 1 80
1 2 0 95
2 1 0 75
2 2 1 80
So participant 2 never took test1 using our system.
Anyway, I want to see whether our new study method had an impact on
test results. I also want to see whether the test number had an impact on
the exam results.
Here is some sample data:
------------
dataSet <- data.frame(
  particID=factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8)),
  whichExam=factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
  studyMethod=factor(c(1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,1)),
  score=c(90,80,75,70,70,58,73,68,69,87,68,79,80,80,99,95))
------------
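A quick way to check the layout described above is to cross-tabulate the factors. This is only a sketch: it rebuilds the sample data so that it runs on its own, and the object names `design` and `perParticipant` are made up for illustration.

```r
# Sample data, repeated here so the block runs on its own
dataSet <- data.frame(
  particID    = factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8)),
  whichExam   = factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
  studyMethod = factor(c(1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,1)),
  score       = c(90,80,75,70,70,58,73,68,69,87,68,79,80,80,99,95))

# Number of scores in each exam x study-method cell
design <- with(dataSet, table(whichExam, studyMethod))
print(design)

# Number of scores each participant has under each study method
perParticipant <- with(dataSet, table(particID, studyMethod))
print(perParticipant)
```

In this sample every exam/method cell holds four scores and every participant has exactly one score per method, so no cell is empty; what is missing is the full set of four exam/method combinations for each participant.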
From what I have read, this should be how to compute an ANOVA on this data:
------------
> summary(aov(score~whichExam*studyMethod+Error(particID),data=dataSet))
Error: particID
                      Df  Sum Sq Mean Sq F value Pr(>F)
whichExam:studyMethod  1  333.06  333.06  1.8211 0.2259
Residuals              6 1097.38  182.90
Error: Within
            Df  Sum Sq Mean Sq F value  Pr(>F)
whichExam    1   3.062   3.062  0.1072 0.75445
studyMethod  1 203.062 203.062  7.1094 0.03721 *
Residuals    6 171.375  28.562
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
------------
Is this the correct way to do an ANOVA test for this data?
From what I can tell, this means that the study method did have a
statistically significant impact on the scores. Is that correct? It
also seems to show that it did not matter which test the subject took,
meaning that the two tests were equally difficult.
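To see the direction and size of the differences behind those F tests, the group means can be computed directly. This is a minimal sketch that rebuilds the sample data so it runs on its own; it assumes that `studyMethod` 1 codes the new method, as SYSTEM does in the tables above, and `methodMeans`, `examMeans`, `wide`, and `gain` are illustrative names only.

```r
# Sample data, repeated here so the block runs on its own
dataSet <- data.frame(
  particID    = factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8)),
  whichExam   = factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
  studyMethod = factor(c(1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,1)),
  score       = c(90,80,75,70,70,58,73,68,69,87,68,79,80,80,99,95))

# Mean score under each study method (assumed: 0 = existing, 1 = new)
methodMeans <- tapply(dataSet$score, dataSet$studyMethod, mean)
print(methodMeans)

# Mean score on each exam
examMeans <- tapply(dataSet$score, dataSet$whichExam, mean)
print(examMeans)

# One row per participant, one column per study method; each cell holds
# a single score, so the sum computed by xtabs is just that score
wide <- xtabs(score ~ particID + studyMethod, data = dataSet)

# Per-participant gain: score with the new method minus the other score
gain <- wide[, "1"] - wide[, "0"]
print(gain)
print(mean(gain))
```

With this sample the method means are 81.125 (new) and 74 (existing), the exam means are 78 and 77.125, and the average per-participant gain is 7.125. These are descriptive figures only; they say nothing about significance on their own.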
What exactly do the titles "Error ..." mean?
What are "Residuals"?
Can anyone recommend a good book on R that covers this material? All
I can find are books on SPSS.