[R] lmer for mixed effects modeling of a loglinear model

Sat Apr 30 03:34:10 CEST 2005

I have a dataset with 25 subjects and 25 items. For each subject-item 
combination, there's a 0/1 score for two parts, A and B. I'm thinking 
of this as a set of 2 x 2 tables, 25 x 25 of them. I'd like to fit a 
log-linear model to this data to test the independence of the A and B 
scores.

If I ignore the subject and item parts, the following works just fine:
     glm(count ~ A * B, family = poisson, data =llform.df)

However, there's variability in the success probability by both item 
and subject. There are difficult items (for both parts A and B 
jointly) and there are proficient subjects (for A and B jointly). I 
would like to include these as random effects, and since I don't want 
a separate effect for A and for B, I created a new variable AB = I(A 
+ B), representing the additive effect of A and B. I then tried
     lmer(count ~ A * B + (AB - 1 | item) + (AB - 1 | subj),
          family = poisson, data = llform.df)

The lmer function (from the lme4 package) only detects a portion of 
the variance due to item and subject, and finds a highly significant 
AxB interaction effect that is really only due to the situation that 
good subjects with easy items tend to get both A and B right and so 
forth.

I'm not sure whether I'm modeling this wrong in lmer, I'm thinking 
about this wrong, or I'm asking too much (or any subset of the 
above), but I'd be grateful for any advice. Below I given a few 
sample lines of the dataframe, as well as the code I used to generate 
it.

Thanks,
Steve

Start of dataframe:

      item subj A B AB count
1       1    1 0 0  0 FALSE
1100    1    1 0 1  1 FALSE
1102    1    1 1 0  1 FALSE
1104    1    1 1 1  2  TRUE
2       1    2 0 0  0 FALSE
2100    1    2 0 1  1 FALSE
2102    1    2 1 0  1 FALSE
2104    1    2 1 1  2  TRUE
3       1    3 0 0  0 FALSE
3100    1    3 0 1  1 FALSE
3102    1    3 1 0  1 FALSE
3104    1    3 1 1  2  TRUE

Code to generate it:

test.df <- data.frame(item = gl(25, 25), subj = gl(25, 1, 625))
test.df$item.level <- with(test.df, rnorm(25, sd  = 1)[item])
test.df$subj.level <- with(test.df, rnorm(25, sd = 1)[subj])
test.df$Atemp <- as.numeric(with(test.df, runif(625) < 1/(1 + 
exp(-item.level + subj.level))))
test.df$Btemp <- as.numeric(with(test.df, runif(625) < 1/(1 + 
exp(-item.level + subj.level))))
llform.df <- rbind(test.df, test.df, test.df, test.df)
llform.df$A <- rep(0:1, each = 625 * 2)
llform.df$B <- rep(0:1, each = 625, times = 2)
llform.df$AB <- apply(llform.df[,7:8], 1, sum)
llform.df$count <- with(llform.df, A == Atemp & B == Btemp)
llform.df <- llform.df[order(llform.df$item, llform.df$subj),]

platform powerpc-apple-darwin7.9.0
arch     powerpc
os       darwin7.9.0
system   powerpc, darwin7.9.0
status
major    2
minor    1.0
year     2005
month    04
day      18
language R