[R] Function to read a string as the variables as opposed to taking the string name as the variable

Marc Schwartz marc_schwartz at me.com
Thu May 14 21:08:02 CEST 2009


On May 14, 2009, at 12:16 PM, Lori Simpson wrote:

> I am writing a custom function that uses an R-function from the
> reshape package: cast.  However, my question could be applicable to
> any R function.
>
> Normally one writes the arguments directly into a function, e.g.:
>
> result=cast(table1, column1 + column2 + column3   ~    column4,
> mean)      (1)
>
> I need to be able to write this statement as follows:
>
> result=cast(table1, string_with_columns   ~    column4, mean)    (2)
> string_with_columns = group of functions that ultimately outputs:
> "column1 + column2 + column3"
>
> Statement 1 outputs the correct results because I have manually typed
> in the column names I want to use.  However, statement 2 thinks that
> 'string' is the name of a column rather than knowing to paste the
> string in string.
>
>
>
>
> OR
>
>
>
>
> To phrase this problem in a more generic manner, here is an example
> using a simpler function:
>
> first=4
> second=6
> third="first,second"
> max(first,second)  //correctly outputs 6
> max(third)  //outputs "first,second" because it doesn't know to paste
> in the variables first and second, how do I get R to do this?
>
> Any help is appreciated.


Your two examples actually require different approaches.


In the first example, you want to build the formula as a character
string and coerce it to a 'formula' object, which can be used here and
with other functions that take a formula as an argument (e.g.,
regression models).

For example:

string_with_columns <- paste("column", 1:3, sep = "", collapse = " + ")

 > string_with_columns
[1] "column1 + column2 + column3"


form <- paste(string_with_columns, "column4", sep = " ~ ")

 > form
[1] "column1 + column2 + column3 ~ column4"


You would then use something like:

   cast(table1, as.formula(form), mean)
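
Since you mention wrapping this in your own function, here is a minimal
sketch of how the string-to-formula step might be packaged. The helper
name build_formula and its arguments are made up for illustration, and
the lm() call on the built-in mtcars data is only there to show the
same idea with another formula-based function; it is not part of your
cast() problem.

```r
# Hypothetical helper (not from any package): paste column names
# together and coerce the resulting string to a formula object.
build_formula <- function(lhs, rhs) {
  form_string <- paste(paste(lhs, collapse = " + "),
                       paste(rhs, collapse = " + "),
                       sep = " ~ ")
  as.formula(form_string)
}

# Column names as in the example above
f <- build_formula(c("column1", "column2", "column3"), "column4")
class(f)      # "formula"
all.vars(f)   # "column1" "column2" "column3" "column4"

# The same approach with a regression model on a built-in data set
fit <- lm(build_formula("mpg", c("wt", "hp")), data = mtcars)
names(coef(fit))   # "(Intercept)" "wt" "hp"
```

With reshape loaded, something like cast(table1, f, mean) should then
behave the same as typing the formula by hand.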



For your second example, you don't need a formula object. You just
need to 'get' the values of the objects named in the string:

first <- 4
second <- 6
third <- "first, second"

vars <- unlist(strsplit(third, split = ", "))

 > vars
[1] "first"  "second"

 > sapply(vars, get)
  first second
      4      6

 > max(sapply(vars, get))
[1] 6

 > sum(sapply(vars, get))
[1] 10
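
One caveat if you move this inside your own function: sapply(vars, get)
looks the names up from the global environment, so it will not find
variables that are local to the calling function. A rough sketch of one
way around that, using mget() with an explicit environment, follows.
The helper name apply_to_named is made up for illustration, and the
split here also tolerates "first,second" with no space after the comma,
as in your original post.

```r
# Hypothetical helper: apply `fun` to the objects whose names are given
# in a comma-separated string, looked up in the caller's environment.
apply_to_named <- function(fun, names_string, envir = parent.frame()) {
  # Split on commas, then strip any surrounding whitespace
  vars <- trimws(strsplit(names_string, ",", fixed = TRUE)[[1]])
  # Fetch all named objects at once as a list
  vals <- mget(vars, envir = envir, inherits = TRUE)
  # Pass the values as unnamed arguments to `fun`
  do.call(fun, unname(vals))
}

first <- 4
second <- 6

apply_to_named(max, "first,second")    # 6
apply_to_named(sum, "first, second")   # 10
```

The envir argument defaults to the environment the helper is called
from, so the same call should also work when first and second are local
variables of another function.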


So, see ?as.formula and ?get. Also see ?paste and ?strsplit for  
manipulating the character vectors.

HTH,

Marc Schwartz



