[R] Passing the name of a variable to a function

Douglas Bates bates at stat.wisc.edu
Thu Aug 5 00:07:24 CEST 2010


On Wed, Aug 4, 2010 at 2:09 PM, Erik Iverson <eriki at ccbr.umn.edu> wrote:
> Hello,
>
>
>> I have a problem which has bitten me occasionally. I often need to
>> prepare graphs for many variables in a data set, but seldom for all,
>> or for any large number of sequential or sequentially named variables.
>> Often I need several graphs for different subsets of the dataset
>> for a given variable.  I run into similar problems with other needs
>> besides graphing.
>>
>>  What I would like to do is something like "write a function which
>> takes the *name* of a variable, presumably as a character string,
>> from a dataframe, as one argument, and the dataframe, as a second
>> argument".
>>
>> For example, where y is to be the name of a variable in a given
>> dataframe d, and the other variables needed, T, M and so on, are
>> to be found in the same dataframe :-
>>
>> pf <- function (y,data,...) {
>> p1 <- xyplot(y~x|T,data)
>> p2 <- xyplot(y~x|T,subset(data,M == 2))
>> p3 <- xyplot(y~x|T,subset(data,M == 4))
>> #print(p1,p2,p3....)
>> }
>>  pf(Score1,data)
>> pf(Score2,data)
>>
>>
>> This fails because, of course, Score1, Score2, etc. are not
>> defined, or, if you pass them as pf(data$Score2, data), then when you
>> subset the data, data$Score2 is now the wrong shape.  I've come up with
>> various inelegant hacks (often with for loops) for getting around this
>> over the last few years, but I can't help feeling that I'm missing
>> something obvious, which I've been too dim to spot.
>
> Depending on your needs (e.g., you use formulas, which can be trickier),
> I think I often do something like:
>
> # I prefer this, I quote the variable name...
>
> df1 <- data.frame(x = rnorm(100),
>                  score1 = rnorm(100),
>                  M = sample(c(2, 4), 100, replace = TRUE))
>
> pf <- function (y,data,...) {
>  data$y <- data[[y]]
>  xyplot(y~x, subset(data, M == 2))
> }
>
> pf("score1", df1)
>
> # as an alternative, use eval/substitute, don't have to quote
>
> pf2 <- function (y,data,...) {
>  data$y <- eval(substitute(y), data)
>  xyplot(y~x, subset(data, M == 2))
> }

That's okay until you get a name collision with an existing y in the
data frame.  I would approach the problem by substituting the variable
name into the formula, perhaps using a hidden placeholder name like .y
instead of y.  (The general rule is that a programmer can intentionally
use a name starting with . and expect that it will not conflict with the
names chosen by users.  Hostile users who use variable names that start
with . get what they deserve.)

> eval(substitute(.y ~ x, list(.y = as.name("score1"))))
score1 ~ x
> str(eval(substitute(.y ~ x, list(.y = as.name("score1")))))
Class 'formula' length 3 score1 ~ x
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
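
Wrapped in a function, the same substitution might look like the
sketch below.  The helper name make_formula and the small data frame
df2 are made up for illustration; they are not from the earlier
messages.  Resetting the formula's environment to the caller's is my
own assumption about what is wanted, since eval() inside a function
would otherwise attach the function's frame.  Note that df2 already
has a column called y, which is left untouched.

```r
# Hypothetical helper: build `<yname> ~ x` from a column name given as a string.
make_formula <- function(yname, env = parent.frame()) {
  # .y is only a placeholder; substitute() swaps in the real symbol
  f <- eval(substitute(.y ~ x, list(.y = as.name(yname))))
  # assumption: make the formula look as if it was written by the caller
  environment(f) <- env
  f
}

# Illustrative data with an existing column named y
df2 <- data.frame(x = 1:6,
                  y = 6:1,
                  score1 = c(2, 4, 6, 8, 10, 12),
                  M = c(2, 4, 2, 4, 2, 4))

f <- make_formula("score1")

# The formula picks up score1 from whatever subset is supplied
mf <- model.frame(f, data = subset(df2, M == 2))

# Only build the plot if lattice is installed
if (requireNamespace("lattice", quietly = TRUE)) {
  p <- lattice::xyplot(f, data = subset(df2, M == 2))
}
```

Because nothing is copied into the data frame, subsetting cannot leave
the response with the wrong shape, and no column is overwritten.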

> pf2(score1, df1)


