[R] eval(parse(text vs. get when accessing a function

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Sat Jan 6 15:16:51 CET 2007


Dear Greg,


On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
> Ramon,
>
> I prefer to use the list method for this type of thing. Here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
>


Those suggestions do apply to me, of course (no claim to being
organized nor beyond idiocy here). And actually the suggestions on
this thread have been very useful. I think, though, that I was not
very clear about the context, and my examples were too dumbed down. So
I'll try to give more detail (nothing here is secret; I was just trying
not to bore people).

The code is part of a web-based application, so there is no
interactive user. The R code is passed the arguments (and optional
user functions) from the CGI.

There is one "core" function (call it cvFunct) that, among other
things, does cross-validation. So this is one way to do things:

cvFunct <- function(whatever, genefiltertype, whateverelse) {
      internalGeneSelect <- eval(parse(text = paste("geneSelect",
                                             genefiltertype, sep = ".")))

      ## do things calling internalGeneSelect,
}

and now define all possible functions as

geneSelect.Fratio <- function(x, y, z) { ## something
}
geneSelect.Wilcoxon <- function(x, y, z) { ## something else
}

If I want more geneSelect functions, adding them is simple. And I can
even allow the user to pass her/his own function, with the only
restrictions that it takes three args, x, y, z, and that the function
is named "geneSelect." followed by a user-chosen string. (Yes, I need
to make sure no calls to "system", etc, are in the user code, etc,
etc, but that is another issue).
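
For comparison with the eval(parse(text version above, here is a
minimal sketch of the same name-based lookup done with get(), the
alternative named in the subject of this thread. The helper name
lookupGeneSelect, the suffix check, and the placeholder function
bodies are assumptions made for illustration; they are not part of the
application described here.

```r
## Placeholder selection functions (bodies are illustrative only).
geneSelect.Fratio <- function(x, y, z) "Fratio called"
geneSelect.Wilcoxon <- function(x, y, z) "Wilcoxon called"

## Hypothetical helper: map a suffix such as "Fratio" to the function
## named "geneSelect.Fratio".
lookupGeneSelect <- function(genefiltertype) {
    ## Accept a single string made only of letters, digits, dots and
    ## underscores, so the suffix can only ever be part of a name.
    stopifnot(is.character(genefiltertype),
              length(genefiltertype) == 1,
              grepl("^[A-Za-z0-9._]+$", genefiltertype))
    ## mode = "function" makes get() look only for function objects.
    get(paste("geneSelect", genefiltertype, sep = "."), mode = "function")
}

internalGeneSelect <- lookupGeneSelect("Fratio")
internalGeneSelect(NULL, NULL, NULL)
```

Since get() treats the string purely as an object name, nothing in the
suffix is evaluated as code, whereas eval(parse(text = ...)) evaluates
whatever expression the pasted string happens to contain.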

The general idea is not new of course. For instance, in package
"e1071", a somewhat similar thing is done in function "tune", and
David Meyer there uses "do.call". However, tune is a lot more general
than what I had in mind. For instance, "tune" deals with arbitrary
functions, with arbitrary numbers and names of parameters, whereas my
functions above all take only three arguments (x: a matrix, y: a
vector; z: an integer), so the neat functionality provided by
"do.call", and passing the args as a list is not really needed.

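For reference, a minimal sketch of what the do.call route looks like
with the fixed three-argument signature. The function body and the
argument values are made up for illustration; only the shape of the
call matters.

```r
## Placeholder selection function taking x (matrix), y (vector),
## z (integer); the body just reports what it received.
geneSelect.Fratio <- function(x, y, z) {
    list(rows = nrow(x), n = length(y), keep = z)
}

## Arguments collected in a named list, as do.call expects.
args <- list(x = matrix(0, nrow = 4, ncol = 2), y = c(1, 2), z = 3L)

## do.call accepts either a function or a character string naming one.
res <- do.call(paste("geneSelect", "Fratio", sep = "."), args)
res$rows
```

With only three fixed arguments, the list adds little over calling the
function directly, which is the point made above.
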
So, given that my situation is so structured, and I do not need
"do.call", I think the approach via eval(parse(paste makes my life
simple:

a) the central function (cvFunct) uses something I can easily
recognize: "internalGeneSelect"

b) after the initial eval(parse(text I do not need to worry anymore
about what the "true" gene selection function is called

c) adding new functions and calling them is simple: function naming
follows a simple pattern ("geneSelect." + postfix) and calling the
user function only requires passing the postfix to cvFunct.

d) notice also that the functions I define, at least, will of course
not be named "f.1", etc, but rather things like "geneSelect.Fratio" or
"geneSelect.namesThatStartWithCuteLetters".

I hope this makes things clearer. I did not include this detail
before because it is probably boring (I guess most of you have stopped
reading by now :-).


> Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea.  Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them.
>
> With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
>

But I don't see how having your functions as list elements is easier
(especially if the function is longer than 2 to 3 lines) than having
all functions systematically named things such as:

geneSelect.Fratio
geneSelect.Random
geneSelect.LetterA
etc

Of course, I could have a list with the components named "Fratio",
"Random", "LetterA". But I fail to see what it adds. And it forces me
to build the list, and probably rebuild it when (or not build it until)
the user enters her/his own selection function. The latter I do not
need to do with the scheme above.
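
For concreteness, a minimal sketch of the list version being compared
here, including the extra step of adding a user function once it
arrives. The names geneSelectors, userSelect and cvFunctList, and the
placeholder bodies, are assumptions for illustration only.

```r
## Built-in selection functions kept as named list components.
geneSelectors <- list(
    Fratio = function(x, y, z) "Fratio called",
    Random = function(x, y, z) "Random called"
)

## Hypothetical user-supplied function, available only later.
userSelect <- function(x, y, z) "user function called"

## The extra step: the list must be extended before the lookup works.
geneSelectors[["LetterA"]] <- userSelect

## Hypothetical core function using the list instead of eval(parse(.
cvFunctList <- function(genefiltertype, selectors) {
    internalGeneSelect <- selectors[[genefiltertype]]
    if (!is.function(internalGeneSelect))
        stop("unknown gene filter type: ", genefiltertype)
    internalGeneSelect(NULL, NULL, NULL)
}

cvFunctList("LetterA", geneSelectors)
```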


> With your function, what if the user runs:
>
> > g(5,3)
>
> What should it do?  (you have only shown definitions for f.1 and f.2).  With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now.  If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
>
>

I see the general concern, but not how it applies here. If I pass
argument "Fratio" then either I use geneSelect.Fratio or I get an
error if "geneSelect.Fratio" does not exist. Similar to what would
happen if I do

g1(2, 8)

when f.8 is not defined:

Error in eval(expr, envir, enclos) : object "f.8" not found

So even in more general cases, except for function redefinitions, etc,
you are not able to call non-existent stuff.
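
A small sketch of that failure, reusing f.1, f.2 and g from the first
message of this thread and assuming no object named f.8 exists
anywhere on the search path. tryCatch is used only so that the error
message can be captured instead of stopping the script.

```r
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}

g <- function(x, fpost) {
    calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
    calledf(x)
}

## A defined postfix works as expected.
ok <- g(2, 1)

## An undefined postfix raises an error rather than calling anything.
msg <- tryCatch(g(2, 8), error = function(e) conditionMessage(e))
msg
```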

> 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again.  I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
>
>

Yes, that is true. Again, it does not apply to the actual case I have
in mind, but of course, without the detailed info on context I just
gave, you could not know that.
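
For the general case Greg describes, here is a hedged sketch of how
the two lookups differ when a name has been reused for a non-function.
The environments defs and work are made up for the example; they stand
in for "where the functions were defined" and "where the name was later
overwritten".

```r
## Environment holding the original function definition.
defs <- new.env()
assign("f.1", function(x) x + 1, envir = defs)

## Child environment where f.1 was later reused for something else.
work <- new.env(parent = defs)
assign("f.1", "output of some analysis", envir = work)

## eval(parse(text = ...)) returns the first object found with that name.
viaParse <- eval(parse(text = "f.1"), envir = work)

## get() with mode = "function" skips objects that are not functions
## and keeps searching the enclosing environments.
viaGet <- get("f.1", envir = work, mode = "function")

is.function(viaParse)
viaGet(1)
```

This only helps when the original function is still reachable further
up; if it was overwritten in the same environment, it is gone either
way.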


> 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
>

Oh, sure. But all the functions above live in a single file (actually,
a minipackage) except for the optional user function (which is read
from a file).


>
> Personally I have never regretted trying not to underestimate my own future stupidity.
>

Neither have I. And actually, that is why I asked: if Thomas Lumley
said, in the fortune, that I had better rethink it, then I should
try rethinking it. But I asked because I failed to see what the
problem is.


> Hope this helps,
>

It certainly does.


Best,

R.


> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow at intermountainmail.org
> (801) 408-8111
>
>
>
> > -----Original Message-----
> > From: r-help-bounces at stat.math.ethz.ch
> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon
> > Diaz-Uriarte
> > Sent: Friday, January 05, 2007 11:41 AM
> > To: Peter Dalgaard
> > Cc: r-help; rdiaz02 at gmail.com
> > Subject: Re: [R] eval(parse(text vs. get when accessing a function
> >
> > On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
> > > Ramon Diaz-Uriarte wrote:
> > > > Dear All,
> > > >
> > > > I've read Thomas Lumley's fortune "If the answer is parse() you
> > > > should usually rethink the question.". But I am not sure if that
> > > > also applies (and why) to other situations (Lumley's comment
> > > > http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
> > > > was in reply to accessing a list).
> > > >
> > > > Suppose I have similarly called functions, except for a
> > postfix. E.g.
> > > >
> > > > f.1 <- function(x) {x + 1}
> > > > f.2 <- function(x) {x + 2}
> > > >
> > > > And sometimes I want to call f.1 and some other times f.2 inside
> > > > another function. I can either do:
> > > >
> > > > g <- function(x, fpost) {
> > > >     calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
> > > >     calledf(x)
> > > >     ## do more stuff
> > > > }
> > > >
> > > >
> > > > Or:
> > > >
> > > > h <- function(x, fpost) {
> > > >     calledf <- get(paste("f.", fpost, sep = ""))
> > > >     calledf(x)
> > > >     ## do more stuff
> > > > }
> > > >
> > > >
> > > > Two questions:
> > > > 1) Why is the second better?
> > > >
> > > > 2) By changing g or h I could use "do.call" instead; why
> > would that
> > > > be better? Because I can handle differences in argument lists?
> >
> > Dear Peter,
> >
> > Thanks for your answer.
> >
> > >
> > > Who says that they are better?  If the question is how to call a
> > > function specified by half of its name, the answer could well be to
> > > use parse(), the point is that you should rethink whether that was
> > > really the right question.
> > >
> > > Why not instead, e.g.
> > >
> > > f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
> > > h <- function(x, fpost) f[[fpost]](x)
> > >
> > > > h(2,"2")
> > >
> > > [1] 4
> > >
> > > > h(2,"1")
> > >
> > > [1] 3
> > >
> >
> > I see, this is a direct way of dealing with the problem.
> > However, you first need to build the f list, and you might
> > not know about that ahead of time. For instance, if I build a
> > function so that the only thing that you need to do to use my
> > function g is to call your function "f.something", and then
> > pass the "something".
> >
> > I am still under the impression that, given your answer,
> > using "eval(parse(text" is not your preferred way.  What are
> > the possible problems (if there are any, that is). I guess I
> > am puzzled by "rethink whether that was really the right question".
> >
> >
> > Thanks,
> >
> > R.
> >
> >
> >
> >
> >
> >
> >
> > > > Thanks,
> > > >
> > > >
> > > > R.
> >
> > --
> > Ramón Díaz-Uriarte
> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > (Spanish National Cancer Center) Melchor Fernández Almagro, 3
> > 28029 Madrid (Spain)
> > Fax: +-34-91-224-6972
> > Phone: +-34-91-224-6900
> >
> > http://ligarto.org/rdiaz
> > PGP KeyID: 0xE89B3462
> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> >
> >
> >
> >
>
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz


