[Rd] problem using model.frame()
Gabor Grothendieck
ggrothendieck at gmail.com
Tue Aug 16 18:35:36 CEST 2005
On 8/16/05, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> On Tue, 2005-08-16 at 11:25 -0400, Gabor Grothendieck wrote:
> > It can handle data frames like this:
> >
> > model.frame(y1)
> > or
> > model.frame(~., y1)
>
> Thanks Gabor,
>
> Yes, I know that works, but I want the function coca.formula to accept a
> formula like this y2 ~ y1, with both y1 and y2 being data frames. It is

The expressions I gave work generally (e.g. in lm, glm, ...), not just in
model.frame, so would it be OK if the user just does this?

yourfunction(y2 ~ ., y1)
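
As a rough illustration of that calling style, here is a minimal sketch using model.frame() directly in place of the hypothetical yourfunction(). It assumes y2 is a numeric matrix (the values below are made up for illustration), since a data frame on the left-hand side would trigger the same error as on the right.

```r
# Predictors as a data frame, as in the original example
y1 <- as.data.frame(matrix(c(3,1,0,1,0,1,1,0,0,0,1,0,0,0,1,1,0,1,1,1),
                           nrow = 5, byrow = TRUE))
names(y1) <- paste("spp", 1:4, sep = "")

# Response as a numeric matrix (illustrative values, not from the thread)
y2 <- cbind(a = c(1, 2, 3, 4, 5), b = c(5, 4, 3, 2, 1))

# The dot is expanded to all columns of the data argument
mf <- model.frame(y2 ~ ., data = y1)
names(mf)
```

The resulting model frame holds the response as a single matrix column followed by one column per variable in y1.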

If it really is important to do it the way you describe, are the data
frames necessarily numeric? If so, you could preprocess your formula
by placing as.matrix() around all the variables that represent data frames,
using something like this:

https://www.stat.math.ethz.ch/pipermail/r-help/2004-December/061485.html

Of course, if they are necessarily numeric, maybe they can be matrices in
the first place?

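
One way such preprocessing might look is sketched below. This is not necessarily what the linked post does; wrap_df() is a hypothetical helper name, and the sketch assumes a simple formula built only from plain variable names and operators (no empty arguments such as x[, 1]).

```r
# Hypothetical helper: walk a formula and wrap every symbol that refers
# to a data frame in as.matrix(). Assumes a simple formula.
wrap_df <- function(expr, env) {
  if (is.name(expr)) {
    nm <- as.character(expr)
    if (nzchar(nm) && exists(nm, envir = env) &&
        is.data.frame(get(nm, envir = env)))
      return(call("as.matrix", expr))
    return(expr)
  }
  if (is.call(expr)) {
    # Skip the function/operator name; recurse into the arguments only
    for (i in seq_along(expr)[-1])
      expr[[i]] <- wrap_df(expr[[i]], env)
  }
  expr
}

y1 <- data.frame(spp1 = c(3, 0, 0, 0, 0), spp2 = c(1, 1, 0, 0, 1))
y2 <- data.frame(a = c(1, 2, 3, 4, 5), b = c(5, 4, 3, 2, 1))

f  <- y2 ~ y1
f2 <- wrap_df(f, environment(f))   # as.matrix(y2) ~ as.matrix(y1)
mf <- model.frame(f2)
```

The variable names in the model frame then carry the as.matrix(...) wrapper, which would need tidying up afterwards if clean names matter.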
> more intuitive, to my mind at least for this particular example and
> analysis, to specify the formula with a data frame on the rhs.
>
> model.frame doesn't work with the formula "~ y1" if the object y1, in
> the environment when model.frame evaluates the formula, is a data.frame.
> It works if y1 is a matrix, however. I'd like to work around this
> problem, say by creating an environment in which y1 is modified to be a
> matrix, if possible. Can this be done?
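
A minimal sketch of the environment idea, under the assumption that the data frames involved are all numeric: create a child of the formula's environment, place matrix copies of any data frames there under the same names, and point the formula at the child. The name shadow_env() is hypothetical and not part of any package.

```r
# Hypothetical helper: return the formula with an environment in which
# data-frame variables are shadowed by matrix copies.
shadow_env <- function(formula) {
  parent <- environment(formula)
  env <- new.env(parent = parent)
  for (nm in all.vars(formula)) {
    if (exists(nm, envir = parent)) {
      val <- get(nm, envir = parent)
      if (is.data.frame(val))
        assign(nm, as.matrix(val), envir = env)
    }
  }
  environment(formula) <- env
  formula
}

y1 <- data.frame(spp1 = c(3, 0, 0, 0, 0), spp2 = c(1, 1, 0, 0, 1))
y2 <- data.frame(a = c(1, 2, 3, 4, 5), b = c(5, 4, 3, 2, 1))

# With no data argument, variables are looked up in the formula's environment
mf <- model.frame(shadow_env(y2 ~ y1))
names(mf)
```

The originals are left untouched, and because the child environment inherits from the original one, a call that also supplies a data argument should still find its variables in the usual way.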
>
> At the moment I have something working by grabbing the bits of the
> formula and then using get() to grab the named object. Of course, this
> won't work if someone wants to use R's formula interface with the
> following formula y2 ~ var1 + var2 + var3, data = y1, or to use the
> subset argument common to many formula implementations. I'd like to have
> the function work in as general a manner as possible, so I'm fishing
> around for potential solutions.
>
> All the best,
>
> Gav
>
> >
> > On 8/16/05, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> > > Hi I'm having a problem with model.frame, encapsulated in this example:
> > >
> > > y1 <- matrix(c(3,1,0,1,0,1,1,0,0,0,1,0,0,0,1,1,0,1,1,1),
> > > nrow = 5, byrow = TRUE)
> > > y1 <- as.data.frame(y1)
> > > rownames(y1) <- paste("site", 1:5, sep = "")
> > > colnames(y1) <- paste("spp", 1:4, sep = "")
> > > y1
> > >
> > > model.frame(~ y1)
> > > Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
> > > invalid variable type
> > >
> > > temp <- as.matrix(y1)
> > > model.frame(~ temp)
> > >   temp.spp1 temp.spp2 temp.spp3 temp.spp4
> > > 1         3         1         0         1
> > > 2         0         1         1         0
> > > 3         0         0         1         0
> > > 4         0         0         1         1
> > > 5         0         1         1         1
> > >
> > > Ideally the above wouldn't have names like temp.var1, temp.var2, but one
> > > could deal with that later.
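
On the naming point, a small sketch may help: the prefixed names appear when the model frame is printed, while the frame itself holds a single matrix column called temp whose original column names are still available.

```r
temp <- matrix(c(3,1,0,1,0,1,1,0,0,0,1,0,0,0,1,1,0,1,1,1),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste("site", 1:5, sep = ""),
                               paste("spp", 1:4, sep = "")))

mf <- model.frame(~ temp)
names(mf)           # one variable, holding the whole matrix
colnames(mf$temp)   # the original column names, without the prefix
```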
> > >
> > > I have tracked down the source of the error message to line 1330 in
> > > model.c - here I'm stumped as I don't know any C, but it looks as if the
> > > code is looping over the variables in the formula and checking if they
> > > are the right "type". So a matrix of variables gets through, but a
> > > data.frame doesn't.
> > >
> > > It would be good if model.frame could cope with data.frames in formulae,
> > > but seeing as I am incapable of providing a patch, is there a way around
> > > this problem?
> > >
> > > Below is the head of the function I am currently using, including the
> > > function for parsing the formula - borrowed and hacked from
> > > ordiParseFormula() in package vegan.
> > >
> > > I can work out the class of the rhs of the formula. Is there a way to
> > > create a suitable environment for the data argument of parseFormula()
> > > such that it contains the rhs dataframe coerced to a matrix, which then
> > > should get through model.frame.default without error? How would I go
> > > about manipulating/creating such an environment? Any other ideas?
> > >
> > > Thanks in advance
> > >
> > > Gav
> > >
> > > coca.formula <- function(formula, method = c("predictive", "symmetric"),
> > >                          reg.method = c("simpls", "eigen"), weights = NULL,
> > >                          n.axes = NULL, symmetric = FALSE, data)
> > > {
> > >     parseFormula <- function (formula, data)
> > >     {
> > >         browser()
> > >         Terms <- terms(formula, "Condition", data = data)
> > >         flapart <- fla <- formula <- formula(Terms, width.cutoff = 500)
> > >         specdata <- formula[[2]]
> > >         X <- eval(specdata, data, parent.frame())
> > >         X <- as.matrix(X)
> > >         formula[[2]] <- NULL
> > >         if (formula[[2]] == "1" || formula[[2]] == "0")
> > >             Y <- NULL
> > >         else {
> > >             mf <- model.frame(formula, data, na.action = na.fail)
> > >             Y <- model.matrix(formula, mf)
> > >             if (any(colnames(Y) == "(Intercept)")) {
> > >                 xint <- which(colnames(Y) == "(Intercept)")
> > >                 Y <- Y[, -xint, drop = FALSE]
> > >             }
> > >         }
> > >         list(X = X, Y = Y)
> > >     }
> > >     if (missing(data))
> > >         data <- parent.frame()
> > >     #browser()
> > >     dat <- parseFormula(formula, data)
> > >
> > > --
> > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > > Gavin Simpson [T] +44 (0)20 7679 5522
> > > ENSIS Research Fellow [F] +44 (0)20 7679 7565
> > > ENSIS Ltd. & ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
> > > UCL Department of Geography [W] http://www.ucl.ac.uk/~ucfagls/cv/
> > > 26 Bedford Way [W] http://www.ucl.ac.uk/~ucfagls/
> > > London. WC1H 0AP.
> > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > >