[R] Passing variable names in quotes to a function

phgrosjean at sciviews.org phgrosjean at sciviews.org
Wed Dec 2 17:50:08 CET 2015


> On 02 Dec 2015, at 16:09, Brant Inman <brant.inman at me.com> wrote:
> 
> Thank you for your response.  Here is the problem that I find with your code (which I had tried).  When you pass a value to the subset argument of the function, it will not hold the quotes on the subsetting variable’s value.
> 
> For example, if I want the function to do Y[Z=='skinny'] so that we use only those values of Y where Z is equal to skinny, I need to be able to retain the quotes around skinny. If you try passing "Z=="skinny"" to the function, it will remove the quotes and give you Z==skinny, which does not work in the subsetting code.
> 
> 

In this simple version, you must completely specify subset (i.e., data$var == "value"). It differs a little bit from the subset= argument in, say, lm().
An example:

# The lm() way:
lm(Sepal.Length ~ Petal.Length, data = iris, subset = Species == "setosa")


# My function
foo <- function(formula, data, subset) {
  if (!missing(subset))
    data <- data[subset, ]
  lm(formula, data = data)
}
foo(Sepal.Length ~ Petal.Length, data = iris, subset = iris$Species == "setosa")
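
As a rough illustration of why the full iris$Species is needed in this simple version: subset is an ordinary argument, so the comparison is evaluated where foo() is called, before data[subset, ] ever runs. The quotes around "setosa" are never lost, because the value is passed as R code and not pasted into a string. A small check, assuming only base R and the built-in iris data (the names rows and bare are just for illustration):

```r
# The comparison is evaluated in the calling environment and yields a
# logical vector with one element per row of iris
rows <- iris$Species == "setosa"
is.logical(rows)
length(rows) == nrow(iris)
sum(rows)  # number of rows that would be kept

# Without the iris$ prefix there is no object called Species in the
# calling environment, so the same expression fails
bare <- tryCatch(Species == "setosa", error = function(e) conditionMessage(e))
bare
```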


Now, if you want the same behaviour as for lm(), it gets a little bit more complicated, and you will have to carefully test your code in various conditions!

foo <- function(formula, data, subset) {
  if (!missing(subset)) {
    rows <- eval(substitute(subset), data)
    data <- data[rows, ]
  }
  lm(formula, data = data)
}

foo(Sepal.Length ~ Petal.Length, data = iris, subset = Species == "setosa")
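
One of the conditions worth testing is a call made from inside another function, where part of the subset expression is a local variable. The sketch below is only one possible variant: foo2 and fit_species are hypothetical names, and the only change from the function above is passing parent.frame() as the enclos argument of eval(), so that names not found in data are looked up in the environment foo2() was called from.

```r
# Hypothetical variant of foo(); only the third argument to eval() is new
foo2 <- function(formula, data, subset) {
  if (!missing(subset)) {
    # Look up names in data first, then in the caller's environment
    rows <- eval(substitute(subset), data, parent.frame())
    data <- data[rows, ]
  }
  lm(formula, data = data)
}

# Here the species name lives in a local variable of the calling function
fit_species <- function(sp) {
  foo2(Sepal.Length ~ Petal.Length, data = iris, subset = Species == sp)
}

fit_nse <- fit_species("setosa")

# Reference fit with explicit subsetting, for comparison
fit_ref <- lm(Sepal.Length ~ Petal.Length,
              data = iris[iris$Species == "setosa", ])
all.equal(coef(fit_nse), coef(fit_ref))
```

Species is found in data, while sp is found in the frame of fit_species(); the two fits should then have identical coefficients.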


Philippe
 

> 
> 
>> On Dec 2, 2015, at 7:10 AM, phgrosjean at sciviews.org wrote:
>> 
>> Your example and explanation are not complete, but I have the gut feeling that you could do all this both more efficiently *and* more R-ish.
>> 
>> First of all, why would you pass Y and X separately, to ultimately build the Y ~ X formula within the body of your function?
>> 
>> Secondly, it seems to me that subY and subY.val do something very similar to the subset argument in, say, lm().
>> 
>> Personally, I would write it like this:
>> 
>> foo <- function(formula, data, subset) {
>>   if (!missing(subset))
>>     data <- data[subset, ]
>>   fit <- some_regression_tool(formula, data = data)
>>
>>   ## <more code>
>>
>>   data_after_processing
>> }
>> 
>> with subset = subY == subY.val.
>> 
>> Best,
>> 
>> Philippe
>> 
>>> On 02 Dec 2015, at 06:11, Brant Inman <brant.inman at me.com> wrote:
>>> 
>>> I am trying to build a function that can accept variables for a regression.  It would work something like this:
>>> 
>>> ---
>>> # Y = my response variable (e.g. income)
>>> # X = my key predictor variable (e.g. education)
>>> # subY = a subsetting variable for Y (e.g. race)
>>> # subY.val = the value of the subsetting value that I want (e.g. 'black')
>>> 
>>> foo <- function(Y, X, subY, subY.val, dataset){
>>> 
>>> if(is.na(subY) == F) {
>>>   Y <- paste(Y, '[', subY, '==', subY.val, ']')
>>> }
>>> FORMULA <- paste(Y ~ X)
>>> fit <- some.regression.tool(FORMULA, data=dataset)
>>> 
>>> return(some.data.after.processing)
>>> }
>>> ---
>>> 
>>> If I call this function with foo(income, education, race, "black", my.dataset), I do not get the result that I need because the FORMULA is "income[race==black] ~ education" when what I need is "income[race=='black'] ~ education".  How do I get the quotes to stay on 'black'?  Or, is there a better way?
>>> 
>>> Help appreciated.
>>> 
>>> --
>>> Brant
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 


