[Rd] formulas and frames

Charles Geyer charlie at stat.umn.edu
Sat Apr 2 22:29:06 CEST 2005

On Fri, Apr 01, 2005 at 09:56:01AM -0500, Gabor Grothendieck wrote:
> Try this:
> my.df <- data.frame(a=1:10, b=11:20, c=21:30, d=31:40)
> > model.response(model.frame(cbind(a,b) ~ c+d, my.df))
>     a  b
> 1   1 11
> 2   2 12
> 3   3 13
> 4   4 14
> 5   5 15
> 6   6 16
> 7   7 17
> 8   8 18
> 9   9 19
> 10 10 20

Well, I learned something.  I didn't know that you could have a multivariate
response, but that doesn't actually address my problem.  I also have
some other variables, which I call "predecessor" variables, that also
need to go in the data frame.  The problem is basically that the R
formula language is just too limiting (unless you are of the "all statistics
is regression" school, which I am not).  In this application, I am just over
the border.  I have "response" variables, "predecessor" variables, and
"predictor" variables (all need to be vectors of the same length or matrices
with the appropriate row dimension, just the usual requirement for data
frames).

I want the user to be able to use the formula language to connect the
"predictor" variables to the linear predictor parameter in the usual
way.  But in order to get any calculations done, I need to get these
other variables -- including the "predecessor" variables, which have
no place in the R formula language (!!) -- into a data frame (if I am
going to use the reshape function on the data).
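
To make the requirement concrete, here is a minimal sketch of one possible
approach, not a settled answer.  It relies on the documented behaviour of
model.frame.default that extra named arguments become further columns with
parenthesised names.  The wrapper stuff_frame and the variables resp, pred,
x1, and x2 are hypothetical names made up for illustration.

```r
# Hypothetical variables living in the calling environment, not in a data frame.
resp <- c(1, 0, 1, 1, 0)                   # "response"
pred <- c(1, 1, 1, 0, 1)                   # "predecessor", not in the formula
x1   <- c(0.5, 1.2, -0.3, 0.8, 2.1)        # "predictor"
x2   <- factor(c("a", "b", "a", "b", "a")) # "predictor"

# stuff_frame() is a hypothetical wrapper: it rewrites its own call into a
# call to model.frame() and evaluates that call in the caller's frame, so
# variables are looked up in 'data' (if supplied) and then in the
# environment of the formula.
stuff_frame <- function(formula, data, pred) {
    cl <- match.call()
    cl[[1L]] <- quote(stats::model.frame)
    eval(cl, parent.frame())
}

mf <- stuff_frame(resp ~ x1 + x2, pred = pred)
names(mf)                    # the extra column is named "(pred)"
pred_col <- mf[["(pred)"]]   # pull the predecessor variable back out
```

The limitation is that the extra variable has to be passed as a named
argument, and its column gets a parenthesised name, which may need renaming
before the frame is handed to reshape.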

Moreover, it is not the "S way" to force the user to construct this
data frame herself.  The variables in the formula (and out of the formula)
can just be anywhere, and R is supposed to "do the right thing".

So I'm still looking for

> > ... a function that
> > just stuffs all that stuff into a data frame (model.frame would do it
> > if I didn't have this extra stuff).
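
For comparison, current versions of the stats package provide get_all_vars,
which returns the untransformed variables named in a formula as a plain data
frame and appends any extra named arguments as columns.  The sketch below
assumes such a version of R is in use; the data frame d and the variable
names are hypothetical.

```r
# Hypothetical data: response and predictors in a data frame,
# the "predecessor" variable left loose in the workspace.
d <- data.frame(resp = c(1, 0, 1, 1, 0),
                x1   = c(0.5, 1.2, -0.3, 0.8, 2.1),
                x2   = factor(c("a", "b", "a", "b", "a")))
pred <- c(1, 1, 1, 0, 1)

# Named arguments after 'data' become extra columns with plain names;
# 'pred' is not in 'd', so it is found in the formula's environment.
all_df <- get_all_vars(resp ~ x1 + x2, data = d, pred = pred)
names(all_df)
```

Unlike model.frame, this does not evaluate transformations in the formula or
apply subset and na.action, so it only covers the "stuff everything into a
data frame" part of the problem.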

Charles Geyer
Professor, School of Statistics
University of Minnesota
charlie at stat.umn.edu
