[Rd] model.matrix.default chokes on backquote (PR#7202)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Aug 28 16:47:52 CEST 2004


"Gabor Grothendieck" <ggrothendieck at myway.com> writes:

> > ggrothendieck at myway.com writes:
> > 
> > > The following gives an error:
> > >
> > >      > `a(b)` <- 1:4
> > >      > `c(d)` <- (1:4)^2
> > >      > lm(`a(b)` ~ `c(d)`)
> > >      Error in model.matrix.default(mt, mf, contrasts) :
> > >      model frame and formula mismatch in model.matrix()
> > >
> > > To fix it replace this line in model.matrix.default:
> > >
> > >      reorder <- match(attr(t, "variables")[-1], names(data))
> > >
> > > with these two lines:
> > >
> > >      strip.backquote <- function(x) gsub("^`(.*)`", "\\1", x)
> > >      reorder <- match(strip.backquote(attr(t, "variables"))[-1],
> > >                strip.backquote(names(data)))
> > 
> > Hmm.. Yes, there's a bug (and it's likely not the only one we have
> > relating to odd variable names in model formulas), but I suspect that
> > the fix is wrong.
> > 
> > The backquotes are not part of the variable names, but get added by
> > deparsing -- sometimes! At other times they are not added: try, for
> > instance, as.character(quote(`a(b)`)). (Which is as it should be. Other
> > pieces of logic relating to nonsyntactic names represent some rather
> > awkward compromises.)
> > 
> > When backquotes have found their way into names(data) or the
> > "variables" attribute, I would rather suspect that they were created
> > by the wrong tool and fix that, not cure the symptom by stripping them
> > off at a later stage.
> 
> In model.frame.default there is a line:
> 
>    varnames <- as.character(vars[-1])
> 
> that turns part of a call object, vars, into a character vector.
> We could change that to:
> 
>    varnames <- strip.backquote(as.character(as.list(vars[-1])))
> 
> or perhaps as.character should not return the backquotes in the
> first place in which case the fix would be to fix as.character.

Or not use it in this way. I forget what the reasoning was behind the
current behaviour of as.character, but the point is that 

> as.character(attr(terms(`a(b)`~`c(d)`),"variables"))
[1] "list"   "`a(b)`" "`c(d)`"

whereas for instance

> sapply(attr(terms(`a(b)`~`c(d)`),"variables")[-1],as.character)
[1] "a(b)" "c(d)"
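
The per-element conversion shown above can be sketched a little further. This
is only an illustration, not a proposed patch: the data frame `d` and the
names `f`, `varnames` and `reorder` are assumptions made up for the example,
and the code does not touch model.frame.default or model.matrix.default. It
converts each symbol separately, so no backquotes are introduced and the
names can be matched directly against the column names.

```r
# Hypothetical illustration; f, d, varnames and reorder are made-up names.
f <- `a(b)` ~ `c(d)`
vars <- attr(terms(f), "variables")   # the call list(`a(b)`, `c(d)`)

# Convert each symbol on its own instead of coercing the whole call at once.
varnames <- vapply(as.list(vars)[-1], as.character, character(1))

# A data frame whose columns carry the same non-syntactic names.
d <- data.frame(1:4, (1:4)^2)
names(d) <- c("a(b)", "c(d)")

# Positions of the formula variables among the columns of d.
reorder <- match(varnames, names(d))

print(varnames)
print(reorder)
```

Because as.character() applied to a single symbol returns the bare name,
no stripping step is needed before the match.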

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907


