[Rd] problem using model.frame()

Gavin Simpson gavin.simpson at ucl.ac.uk
Tue Aug 16 17:39:29 CEST 2005


On Tue, 2005-08-16 at 11:25 -0400, Gabor Grothendieck wrote:
> It can handle data frames like this:
> 
> 	model.frame(y1)
> or
> 	model.frame(~., y1)

Thanks Gabor,

Yes, I know that works, but I want the function coca.formula to accept a
formula like y2 ~ y1, with both y1 and y2 being data frames. To my mind,
at least for this particular example and analysis, it is more intuitive
to specify the formula with a data frame on the rhs.

model.frame doesn't work with the formula "~ y1" if the object y1, in
the environment in which model.frame evaluates the formula, is a
data.frame. It works if y1 is a matrix, however. I'd like to work around
this problem, say by creating an environment in which y1 is modified to
be a matrix, if possible. Can this be done?
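
A minimal sketch of the kind of workaround being asked about, not a tested
solution for coca.formula. It assumes the data frame is all numeric, so that
as.matrix() gives a numeric matrix; the names fla and env are illustrative
only. The idea is to evaluate the formula in a child environment that holds
matrix copies of any data frames the formula refers to:

```r
# Example data frame, as in the original post
y1 <- as.data.frame(matrix(c(3,1,0,1,0,1,1,0,0,0,1,0,0,0,1,1,0,1,1,1),
                           nrow = 5, byrow = TRUE))
colnames(y1) <- paste("spp", 1:4, sep = "")

fla <- ~ y1

# Child environment: anything not masked here is still found in the
# formula's original environment
env <- new.env(parent = environment(fla))

# Mask each data frame named in the formula with a matrix copy
for (v in all.vars(fla)) {
  obj <- get(v, envir = environment(fla))
  if (is.data.frame(obj))
    assign(v, as.matrix(obj), envir = env)
}

# model.frame() looks up variables in the formula's environment
environment(fla) <- env
mf <- model.frame(fla)
```

The original y1 is left untouched; only the copy seen by model.frame() is a
matrix.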

At the moment I have something working by pulling apart the formula and
then using get() to fetch the named objects. Of course, this won't work
if someone wants to use R's formula interface with a call such as
y2 ~ var1 + var2 + var3, data = y1, or to use the subset argument common
to many formula implementations. I'd like to have the function work in
as general a manner as possible, so I'm fishing around for potential
solutions.
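
For the data = and subset = case, one common idiom (the one lm() uses) is to
rebuild the call and hand it to model.frame(). The sketch below shows only
that idiom; build_frame and the toy data frame d are hypothetical names, and
it does not by itself solve the data-frame-on-the-rhs problem:

```r
# Hypothetical helper: forwards formula, data and subset to model.frame()
build_frame <- function(formula, data, subset, ...) {
  cl <- match.call(expand.dots = FALSE)
  # Keep only the arguments that model.frame() should see
  keep <- match(c("formula", "data", "subset"), names(cl), 0L)
  cl <- cl[c(1L, keep)]
  cl[[1L]] <- quote(stats::model.frame)
  # Evaluate in the caller's frame so variables are found as the user expects
  eval(cl, parent.frame())
}

# Toy data, invented for illustration
d <- data.frame(y = c(2, 4, 6, 8), var1 = 1:4, var2 = c(0, 1, 0, 1))
mf <- build_frame(y ~ var1 + var2, data = d, subset = var1 > 2)
```

Because the subset expression is passed on unevaluated, it is evaluated
inside the data, which is what users of other formula interfaces expect.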

All the best,

Gav 

> 
> On 8/16/05, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> > Hi, I'm having a problem with model.frame, encapsulated in this example:
> > 
> > y1 <- matrix(c(3,1,0,1,0,1,1,0,0,0,1,0,0,0,1,1,0,1,1,1),
> >             nrow = 5, byrow = TRUE)
> > y1 <- as.data.frame(y1)
> > rownames(y1) <- paste("site", 1:5, sep = "")
> > colnames(y1) <- paste("spp", 1:4, sep = "")
> > y1
> > 
> > model.frame(~ y1)
> > Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  :
> >        invalid variable type
> > 
> > temp <- as.matrix(y1)
> > model.frame(~ temp)
> >  temp.spp1 temp.spp2 temp.spp3 temp.spp4
> > 1         3         1         0         1
> > 2         0         1         1         0
> > 3         0         0         1         0
> > 4         0         0         1         1
> > 5         0         1         1         1
> > 
> > Ideally the above wouldn't have names like temp.spp1, temp.spp2, but one
> > could deal with that later.
> > 
> > I have tracked down the source of the error message to line 1330 in
> > model.c - here I'm stumped as I don't know any C, but it looks as if the
> > code is looping over the variables in the formula and checking if they
> > are the right "type". So a matrix of variables gets through, but a
> > data.frame doesn't.
> > 
> > It would be good if model.frame could cope with data.frames in formulae,
> > but seeing as I am incapable of providing a patch, is there a way around
> > this problem?
> > 
> > Below is the head of the function I am currently using, including the
> > function for parsing the formula - borrowed and hacked from
> > ordiParseFormula() in package vegan.
> > 
> > I can work out the class of the rhs of the formula. Is there a way to
> > create a suitable environment for the data argument of parseFormula()
> > such that it contains the rhs dataframe coerced to a matrix, which then
> > should get through model.frame.default without error? How would I go
> > about manipulating/creating such an environment? Any other ideas?
> > 
> > Thanks in advance
> > 
> > Gav
> > 
> > coca.formula <- function(formula, method = c("predictive", "symmetric"),
> >                         reg.method = c("simpls", "eigen"), weights = NULL,
> >                         n.axes = NULL, symmetric = FALSE, data)
> >  {
> >    parseFormula <- function (formula, data)
> >      {
> >        browser()
> >        Terms <- terms(formula, "Condition", data = data)
> >        flapart <- fla <- formula <- formula(Terms, width.cutoff = 500)
> >        specdata <- formula[[2]]
> >        X <- eval(specdata, data, parent.frame())
> >        X <- as.matrix(X)
> >        formula[[2]] <- NULL
> >        if (formula[[2]] == "1" || formula[[2]] == "0")
> >          Y <- NULL
> >        else {
> >          mf <- model.frame(formula, data, na.action = na.fail)
> >          Y <- model.matrix(formula, mf)
> >          if (any(colnames(Y) == "(Intercept)")) {
> >            xint <- which(colnames(Y) == "(Intercept)")
> >            Y <- Y[, -xint, drop = FALSE]
> >          }
> >        }
> >        list(X = X, Y = Y)
> >      }
> >    if (missing(data))
> >      data <- parent.frame()
> >    #browser()
> >    dat <- parseFormula(formula, data)
> > 
> > --
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > Gavin Simpson                     [T] +44 (0)20 7679 5522
> > ENSIS Research Fellow             [F] +44 (0)20 7679 7565
> > ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
> > UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
> > 26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
> > London.  WC1H 0AP.
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > 


