[R] confused on model.frame evaluation

Erik Iverson eriki at ccbr.umn.edu
Fri Apr 30 23:52:13 CEST 2010


Hello!

I'm reading through a logistic regression book and using R to replicate
the results.  My question is not directly about the book, but that is
the context in which I ran into it, so here we go.

Consider these data:

interco <- structure(list(white = c(1, 1, 0, 0), male = c(1, 0, 1, 0), 
yes = c(43, 26, 29, 22), no = c(134, 149, 23, 36), total = c(177, 175, 
52, 58)), .Names = c("white", "male", "yes", "no", "total"), row.names = 
c(NA, -4L), class = "data.frame")

We can use logistic regression to analyze this table, using glm's syntax
for successes/failures described at the top of page 191 of MASS, 4th
edition.

summary(glm(as.matrix(interco[c("yes", "no")]) ~ white + male,
             data = interco, family = binomial))
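
For comparison, the same two-column response can also be sketched with 
cbind(), which looks up the count columns inside the data argument 
instead of indexing the data frame object.  This is only an illustrative 
alternative; the object names fit_matrix and fit_cbind are made up here.

```r
interco <- data.frame(white = c(1, 1, 0, 0), male = c(1, 0, 1, 0),
                      yes = c(43, 26, 29, 22), no = c(134, 149, 23, 36),
                      total = c(177, 175, 52, 58))

# Response built by indexing the data frame object itself.
fit_matrix <- glm(as.matrix(interco[c("yes", "no")]) ~ white + male,
                  data = interco, family = binomial)

# Response built from the columns, which glm() finds inside `data`.
fit_cbind <- glm(cbind(yes, no) ~ white + male,
                 data = interco, family = binomial)

# Both calls specify the same successes/failures model, so the
# coefficient estimates can be compared directly.
all.equal(coef(fit_matrix), coef(fit_cbind))
```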


The output prints out, no problem!

Now, another data set.  The identifying feature of this one is that
it contains a column with the same name as the object (i.e., "working"):

working <- structure(list(france = c(1, 1, 1, 1, 0, 0, 0, 0), manual = 
c(1, 1, 0, 0, 1, 1, 0, 0), famanual = c(1, 0, 1, 0, 1, 0, 1, 0), total = 
c(107, 65, 66, 171, 87, 65, 85, 148), working = c(85, 44, 24, 17, 24,
22, 1, 6), no = c(22, 21, 42, 154, 63, 43, 84, 142)), .Names = 
c("france", "manual", "famanual", "total", "working", "no"), row.names = 
c(NA, -8L), class = "data.frame")

summary(glm(as.matrix(working[c("working", "no")]) ~ france + manual + 
famanual, data = working, family = binomial))

Error in model.frame.default(formula = as.matrix(working[c("working",  :
   variable lengths differ (found for 'france')

Well, this error goes away simply by renaming the "working" variable in 
the data.frame "working" to something else.  I found the "eval" line in 
model.frame that's throwing the error, but I'm still confused as to why.
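
A minimal sketch of the renaming workaround, assuming the same data as 
above.  The replacement column name "employed" is hypothetical; any name 
other than "working" should behave the same way.

```r
working <- data.frame(france   = c(1, 1, 1, 1, 0, 0, 0, 0),
                      manual   = c(1, 1, 0, 0, 1, 1, 0, 0),
                      famanual = c(1, 0, 1, 0, 1, 0, 1, 0),
                      total    = c(107, 65, 66, 171, 87, 65, 85, 148),
                      working  = c(85, 44, 24, 17, 24, 22, 1, 6),
                      no       = c(22, 21, 42, 154, 63, 43, 84, 142))

# With the clashing column name the call fails, so wrap it in try().
attempt <- try(glm(as.matrix(working[c("working", "no")]) ~
                     france + manual + famanual,
                   data = working, family = binomial), silent = TRUE)
failed <- inherits(attempt, "try-error")

# Rename the column so it no longer shares its name with the data frame.
names(working)[names(working) == "working"] <- "employed"

# The same model now fits without the error.
fit <- glm(as.matrix(working[c("employed", "no")]) ~
             france + manual + famanual,
           data = working, family = binomial)
coef(fit)
```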

I'm sure it's not a bug, but could someone point to a thread or offer 
some gentle advice on what's happening?  I think it's related to:

test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)
eval(expression(test[c("name1", "name2")]))
eval(expression(test[c("name1", "test")]))
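
One possible way to make the name lookup visible is to pass the data 
frame as the second argument of eval(), so that names are looked up 
among its columns first.  This is a sketch of the name-resolution 
behaviour only, assuming model.frame evaluates formula variables in a 
similar fashion; it is not the actual model.frame code, and the names 
outside and inside are made up for illustration.

```r
test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)

# No envir given: `test` resolves to the data frame in the calling
# environment, so the result is a two-column data frame.
outside <- eval(expression(test[c("name1", "name2")]))

# Data frame given as envir: `test` now resolves to the column test
# (11:15), an unnamed vector, so indexing it by name yields NAs and
# the result has length 2 rather than 5.
inside <- eval(expression(test[c("name1", "name2")]), test)

is.data.frame(outside)
length(inside)
```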


Thanks!

--Erik


