[R] fixed effect significance with lmer() vs. t-test
dej500 at york.ac.uk
Sat Jul 19 17:14:56 CEST 2008
I am looking at data of the following structure:
n <- 100
dataset <- data.frame()
# each pass adds one 0/1 observation for each of the ten subjects
for (i in 1:n){
gender <- c(rep("m",5),rep("f",5))
subject <- letters[1:10]
outcome <- c(rbinom(5,1,0.6),rbinom(5,1,0.4))
# data.frame() rather than cbind(), so that outcome stays numeric
dataset <- rbind(dataset,data.frame(gender,subject,outcome))}
I am interested in the significance of the fixed effect, gender. So I
compare:
one <- lmer(outcome~(1|subject),dataset,binomial)
two <- lmer(outcome~gender+(1|subject),dataset,binomial)
anova(one,two)
I inspect the p-value given under anova(one,two).
Note that usually lmer() -- correctly, since the only difference between
subjects comes from the gender effect -- estimates zero variance for the
random effect here. I am only asking about cases where this variance is
zero!
To my way of thinking, the observations are grouped under ten subjects,
five male and five female. So a reasonable p-value would come from a t-test
of the two groups of five subject scores, viz.:
scores <- xtabs(~outcome+subject,dataset)[2,]/n
male.scores <- scores[1:5]
female.scores <- scores[6:10]
t.test(male.scores,female.scores)
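The same subject scores can also be picked out by gender label rather than by position, which avoids relying on subjects a-e being the males. This is only a sketch under stated assumptions: the helper name subject_scores and the seed are illustrative and not part of the code above, and the data are regenerated in vectorised form so that the block runs on its own.

```r
# Sketch: per-subject proportion of successes, split by gender label.
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 100
dataset <- data.frame(
  gender  = rep(rep(c("m", "f"), each = 5), times = n),
  subject = rep(letters[1:10], times = n),
  outcome = as.vector(replicate(n, c(rbinom(5, 1, 0.6), rbinom(5, 1, 0.4))))
)

# subject_scores is a hypothetical helper name
subject_scores <- function(d) {
  # the mean of a 0/1 outcome is the proportion of successes per subject
  agg <- aggregate(outcome ~ subject + gender, data = d, FUN = mean)
  split(agg$outcome, agg$gender)
}

scores_by_gender <- subject_scores(dataset)
# t.test() defaults to the Welch (unequal variance) test: 5 scores vs 5 scores
tt <- t.test(scores_by_gender$m, scores_by_gender$f)
print(tt$p.value)
```

Whichever way the scores are extracted, the t-test sees only ten numbers, so its degrees of freedom cannot exceed 8.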
When I run these two, I get results like the following:
lmer()       t-test
1.950e-06    1.688e-05
2.042e-07    4.606e-05
0.0001934    0.004178
0.0001447    0.001961
9.168e-07    7.807e-07
As we can see, the anova() p-value on the lmer() models is usually, but not
always, anti-conservative with respect to the t-test, usually by between 1
and 2 orders of magnitude.
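To look at how the two p-values relate over many simulated datasets rather than five, the comparison can be repeated in a loop. The sketch below is a stand-in, not the lmer() analysis itself: it uses base R glm() in place of lmer(), on the assumption that when the subject variance is estimated as zero the mixed-model likelihood coincides with the ordinary logistic-regression likelihood. The function name compare_once, the seed, and the number of repetitions are illustrative choices.

```r
# Sketch: repeat the simulation and collect both p-values each time.
set.seed(42)  # arbitrary seed, only for reproducibility
n <- 100

# compare_once is a hypothetical helper name
compare_once <- function(n) {
  gender  <- rep(rep(c("m", "f"), each = 5), times = n)
  subject <- rep(letters[1:10], times = n)
  outcome <- as.vector(replicate(n, c(rbinom(5, 1, 0.6), rbinom(5, 1, 0.4))))

  # Likelihood-ratio test on all 10*n binary observations
  # (glm() used as a stand-in for the zero-variance mixed model)
  null_fit <- glm(outcome ~ 1, family = binomial)
  full_fit <- glm(outcome ~ gender, family = binomial)
  lrt_p <- anova(null_fit, full_fit, test = "Chisq")[2, "Pr(>Chi)"]

  # Welch t-test on the ten subject-level proportions
  scores <- tapply(outcome, subject, mean)
  t_p <- t.test(scores[letters[1:5]], scores[letters[6:10]])$p.value

  c(lrt = lrt_p, t = t_p)
}

pvals <- t(replicate(200, compare_once(n)))
# Proportion of runs in which the likelihood-ratio p-value is the smaller one
print(mean(pvals[, "lrt"] < pvals[, "t"]))
```

In this stand-in the likelihood-ratio test treats the 10*n binary outcomes as the units of analysis, while the t-test has only ten subject scores to work with, so the two are not guaranteed to agree.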
Can someone please explain why I'm not getting closer agreement between
these two numbers? It seems that both approaches are asking the same
question - what is the significance of the gender effect in the data?
In both approaches, it's the only effect (since subject variance is zero)
and both approaches take into account the non-independence/grouping
structure of the data, but in different ways - the t-test by working with
subject average scores, and the lmer() by...
Am I misunderstanding something here?
Thanks very much,
Daniel