[R] How make a x,y dataset from a formula based entry
Gabor Grothendieck
ggrothendieck at gmail.com
Fri Sep 23 15:32:03 CEST 2011
On Thu, Sep 22, 2011 at 2:54 PM, trekvana <trekvana at aol.com> wrote:
> Hello all,
>
> So I am using the (formula entry) method for randomForest:
>
> randomForest(y~x1+x2+...+x39+x40,data=xxx,...) but the issue is that some of
> the items in that package don't take a formula entry - you have to explicitly
> state the y and x vector:
>
> randomForest(x=xxx[,c('x1','x2',...,'x40')],y=xxx[,'y'],...)
>
> Now my question is whether there is a function/way to tell R to take a
> formula and make the two corresponding datasets [x,y] (that way I don't have
> to create the x dataset manually with all 40 variables I have).
>
> There must be a more elegant way to do this than
> x=xxx[,c('x1','x2',...,'x40')]
Assume the formula has the form:

  fo <- y ~ x1 + x2 + x3

Then extract its variable names, response first:

  v <- all.vars(fo)

If DF is the data frame, then DF[, v[1]] is the response (a vector) and
DF[v[-1]] is the predictors (a data frame). Depending on what you intend
to do next, you may also need to add an intercept column to the
predictors and convert them from a data frame to a matrix.
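
The steps above can be sketched end to end as follows. This is only an
illustration: the data frame DF and its values are made up here to stand in
for the poster's xxx, and the model.matrix() call is one possible way to get
a matrix with an intercept column, not necessarily what a given function
expects.

```r
# Toy data frame standing in for the poster's data (values are hypothetical).
DF <- data.frame(y      = c(1.2, 3.4, 2.2, 5.1),
                 x1     = 1:4,
                 x2     = c(0.5, 0.1, 0.9, 0.3),
                 x3     = c(10, 20, 30, 40),
                 unused = letters[1:4])

fo <- y ~ x1 + x2 + x3

# all.vars() returns the variable names in the formula, response first.
v <- all.vars(fo)

y <- DF[, v[1]]   # response as a plain vector
x <- DF[v[-1]]    # predictors as a data frame (the 'unused' column is dropped)

# One way to get a numeric matrix with an intercept column, if needed.
X <- model.matrix(fo, data = DF)
```

With these in hand, the non-formula call from the question would look like
randomForest(x = x, y = y, ...). Note that this relies on the formula naming
every variable explicitly: for y ~ . all.vars() returns "." rather than the
column names, and for a term such as log(x1) it returns the underlying
variable x1, not the transformed term.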
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com