[R] eval(parse(text vs. get when accessing a function

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Wed Jan 17 22:01:56 CET 2007


(I overlooked the reply).

Thanks, Gabor. That is neat and easy! (And I should have been able to
see it on my own :-( )

Best,

R.

On 1/8/07, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> The S4 is not essential.  You could do it in S3 too:
>
> > f.a <- function(x) x+1
> > f.b <- function(x) x+2
> > f <- function(x) UseMethod("f")
> >
> > f(structure(10, class = "a"))
> [1] 11
> attr(,"class")
> [1] "a"
>
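Applied to the geneSelect case discussed further down the thread, the S3
idea could look roughly like the sketch below. The helper asSelectorTag,
the 'type' argument, and the selector body are made up for illustration,
and dispatching on a tag object is just one possible design, not something
proposed in this exact form in the thread.

```r
# Generic: dispatches on the class of its first argument.
geneSelect <- function(type, x, y, z) UseMethod("geneSelect")

# Hypothetical selector; the body is a placeholder, not a real F-ratio filter.
geneSelect.Fratio <- function(type, x, y, z) {
    paste("Fratio on", nrow(x), "genes, keeping", z)
}

# Anything without a matching method ends up here.
geneSelect.default <- function(type, x, y, z) {
    stop("unknown gene filter type: ", class(type)[1])
}

# Wrap the postfix string in an empty object whose class is that string.
asSelectorTag <- function(postfix) structure(list(), class = postfix)

x <- matrix(1:6, nrow = 3)
res <- geneSelect(asSelectorTag("Fratio"), x, y = c(0, 1), z = 2L)
```

An unknown postfix falls through to geneSelect.default and stops with an
error, so nothing unintended gets called.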
> On 1/6/07, Ramon Diaz-Uriarte <rdiaz02 at gmail.com> wrote:
> > Hi Martin,
> >
> >
> >
> > On 1/6/07, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> > > Hi Ramon,
> > >
> > > It seems like a naming convention (f.xxx) and eval(parse(...)) are
> > > standing in for objects (of class 'GeneSelector', say, representing a
> > > function with a particular form and doing a particular operation) and
> > > dispatch (a function 'geneConverter' might handle a converter of class
> > > 'GeneSelector' one way, user supplied ad-hoc functions more carefully;
> > > inside geneConverter the only real concern is that the converter
> > > argument is in fact a callable function).
> > >
> > > eval(parse(...)) brings scoping rules to the fore as an explicit
> > > programming concern; here scope is implicit, but that's probably better
> > > -- R will get its own rules right.
> > >
> > > Martin
> > >
> > > Here's an S4 sketch:
> > >
> > > setClass("GeneSelector",
> > >          contains="function",
> > >          representation=representation(description="character"),
> > >          validity=function(object) {
> > >              msg <- NULL
> > >              argNames <- names(formals(object))
> > >              if (argNames[1]!="x")
> > >                msg <- c(msg, "\n  GeneSelector requires a first argument named 'x'")
> > >              if (!"..." %in% argNames)
> > >                msg <- c(msg, "\n  GeneSelector requires '...' in its signature")
> > >              if (0==length(object@description))
> > >                msg <- c(msg, "\n  Please describe your GeneSelector")
> > >              if (is.null(msg)) TRUE else msg
> > >          })
> > >
> > > setGeneric("geneConverter",
> > >            function(converter, x, ...) standardGeneric("geneConverter"),
> > >            signature=c("converter"))
> > >
> > > setMethod("geneConverter",
> > >           signature(converter="GeneSelector"),
> > >           function(converter, x, ...) {
> > >               ## important stuff here
> > >               converter(x, ...)
> > >           })
> > >
> > > setMethod("geneConverter",
> > >           signature(converter="function"),
> > >           function(converter, x, ...) {
> > >               message("ad-hoc converter; hope it works!")
> > >               converter(x, ...)
> > >           })
> > >
> > > and then...
> > >
> > > > c1 <- new("GeneSelector",
> > > +           function(x, ...) prod(x, ...),
> > > +           description="Product of x")
> > > >
> > > > c2 <- new("GeneSelector",
> > > +           function(x, ...) sum(x, ...),
> > > +           description="Sum of x")
> > > >
> > > > geneConverter(c1, 1:4)
> > > [1] 24
> > > > geneConverter(c2, 1:4)
> > > [1] 10
> > > > geneConverter(mean, 1:4)
> > > ad-hoc converter; hope it works!
> > > [1] 2.5
> > > >
> > > > cvterr <- new("GeneSelector", function(y) {})
> > > Error in validObject(.Object) : invalid class "GeneSelector" object: 1:
> > >   GeneSelector requires a first argument named 'x'
> > > invalid class "GeneSelector" object: 2:
> > >   GeneSelector requires '...' in its signature
> > > invalid class "GeneSelector" object: 3:
> > >   Please describe your GeneSelector
> > > > xxx <- 10
> > > > geneConverter(xxx, 1:4)
> > > Error in function (classes, fdef, mtable)  :
> > >         unable to find an inherited method for function "geneConverter", for signature "numeric"
> > >
> >
> >
> >
> > Thanks!! That is actually a rather interesting alternative approach
> > and I can see it also adds a lot of structure to the problem. I have
> > to confess, though, that I am not a fan of OOP (nor of S4 classes); in
> > this case, in particular, it seems there is a lot of scaffolding in
> > the code above (the counterpoint to the structure?) and, regarding
> > scoping rules, I prefer to think about them explicitly (I find it much
> > simpler than inheritance).
> >
> > Best,
> >
> > R.
> >
> >
> > >
> > > "Ramon Diaz-Uriarte" <rdiaz02 at gmail.com> writes:
> > >
> > > > Dear Greg,
> > > >
> > > >
> > > > On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
> > > >> Ramon,
> > > >>
> > > > >> I prefer to use the list method for this type of thing; here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
> > > >>
> > > >
> > > >
> > > > Those suggestions do apply to me of course (no claim to being
> > > > organized nor beyond idiocy here). And actually the suggestions on
> > > > this thread are being very useful. I think, though, that I was not
> > > > very clear on the context and my examples were too dumbed down. So
> > > > I'll try to give more detail (nothing here is secret, I am just trying
> > > > not to bore people).
> > > >
> > > > The code is part of a web-based application, so there is no
> > > > interactive user. The R code is passed the arguments (and optional
> > > > user functions) from the CGI.
> > > >
> > > > There is one "core" function (call it cvFunct) that, among other
> > > > things, does cross-validation. So this is one way to do things:
> > > >
> > > > cvFunct <- function(whatever, genefiltertype, whateverelse) {
> > > >       internalGeneSelect <- eval(parse(text = paste("geneSelect",
> > > >                                              genefiltertype, sep = ".")))
> > > >
> > > >       ## do things calling internalGeneSelect,
> > > > }
> > > >
> > > > and now define all possible functions as
> > > >
> > > > > geneSelect.Fratio <- function(x, y, z) { }    ## something
> > > > > geneSelect.Wilcoxon <- function(x, y, z) { }  ## something else
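For comparison, the same cvFunct skeleton can look the selector up with
get() instead of eval(parse(text = ...)). This is a minimal sketch: the
two selectors (First, Last) are hypothetical stand-ins, and the argument
order of cvFunct is assumed; only the x, y, z signature and the
"geneSelect." prefix come from the message.

```r
# Hypothetical selectors: return the indices of the z genes (columns) to keep.
geneSelect.First <- function(x, y, z) seq_len(z)
geneSelect.Last  <- function(x, y, z) rev(seq_len(ncol(x)))[seq_len(z)]

cvFunct <- function(x, y, z, genefiltertype) {
    fname <- paste("geneSelect", genefiltertype, sep = ".")
    # mode = "function" ignores non-function objects that share the name
    internalGeneSelect <- get(fname, mode = "function")
    ## do things calling internalGeneSelect
    internalGeneSelect(x, y, z)
}

x <- matrix(rnorm(20), nrow = 4)   # 4 samples, 5 genes
y <- c(0, 0, 1, 1)
picked <- cvFunct(x, y, 2L, "Last")
```

The rest of cvFunct is unchanged: after the lookup, internalGeneSelect is
an ordinary function, whatever it was originally called.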
> > > >
> > > > If I want more geneSelect functions, adding them is simple. And I can
> > > > even allow the user to pass her/his own functions, with the only
> > > > restriction that it takes three args, x, y, z, and that the function
> > > > > is to be called "geneSelect." plus a user-chosen string. (Yes, I need
> > > > to make sure no calls to "system", etc, are in the user code, etc,
> > > > etc, but that is another issue).
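One practical difference when the postfix arrives from outside (for
example from a CGI parameter) is that parse() treats the whole string as
R code, whereas get() treats it as a single object name. The sketch below
illustrates this with a contrived "hostile" postfix; viaParse, viaGet,
and the injected flag are names invented for this example.

```r
injected <- FALSE
geneSelect.Fratio <- function(x, y, z) "Fratio"   # placeholder body

viaParse <- function(type) eval(parse(text = paste("geneSelect", type, sep = ".")))
viaGet   <- function(type) get(paste("geneSelect", type, sep = "."), mode = "function")

# A postfix that smuggles in a second statement after a semicolon.
hostile <- "Fratio; injected <<- TRUE"

# parse() yields two expressions; eval() runs both and returns the last value.
parsed <- viaParse(hostile)

# get() looks for one object with that odd name, finds none, and signals an error.
getFailed <- inherits(tryCatch(viaGet(hostile), error = function(e) e), "error")
```

With eval(parse(...)) the extra statement is executed and the value that
comes back is not even a function; with get() the lookup simply fails.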
> > > >
> > > > The general idea is not new of course. For instance, in package
> > > > "e1071", a somewhat similar thing is done in function "tune", and
> > > > David Meyer there uses "do.call". However, tune is a lot more general
> > > > than what I had in mind. For instance, "tune" deals with arbitrary
> > > > functions, with arbitrary numbers and names of parameters, whereas my
> > > > functions above all take only three arguments (x: a matrix, y: a
> > > > vector; z: an integer), so the neat functionality provided by
> > > > "do.call", and passing the args as a list is not really needed.
> > > >
> > > > So, given that my situation is so structured, and I do not need
> > > > "do.call", I think the approach via eval(parse(paste makes my life
> > > > simple:
> > > >
> > > > a) the central function (cvFunct) uses something I can easily
> > > > recognize: "internalGeneSelect"
> > > >
> > > > b) after the initial eval(parse(text I do not need to worry anymore
> > > > about what the "true" gene selection function is called
> > > >
> > > > c) adding new functions and calling them is simple: function naming
> > > > follows a simple pattern ("geneSelect." + postfix) and calling the
> > > > user function only requires passing the postfix to cvFunct.
> > > >
> > > > d) notice also that, at least the functs. I define, will of course not
> > > > be named "f.1", etc, but rather things like "geneSelect.Fratio" or
> > > > "geneSelect.namesThatStartWithCuteLetters";
> > > >
> > > > I hope this makes things more clear. I did not include this detail
> > > > because this is probably boring (I guess most of you have stopped
> > > > reading by now :-).
> > > >
> > > >
> > > >> Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea.  Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them.
> > > >>
> > > >> With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
> > > >>
> > > >
> > > > But I don't see how having your functions as list elements is easier
> > > > > (especially if the function is longer than 2 to 3 lines) than having
> > > > all functions systematically named things such as:
> > > >
> > > > geneSelect.Fratio
> > > > geneSelect.Random
> > > > geneSelect.LetterA
> > > > etc
> > > >
> > > > Of course, I could have a list with the components named "Fratio"
> > > > "Random", "LetterA". But I fail to see what it adds. And it forces me
> > > > > to build the list, and probably rebuild it when (or not build it until)
> > > > > the user enters her/his own selection function. But the latter I do not
> > > > need to do with the scheme above.
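For what it is worth, a list does not have to be rebuilt when a user
function arrives: adding an element is a single assignment. A rough
sketch follows, in which registerSelector, lookupSelector, and the three
selectors are all hypothetical names chosen for illustration.

```r
# Built-in selectors, keyed by the same postfix that would follow "geneSelect.".
geneSelectors <- list(
    First = function(x, y, z) seq_len(z),
    Last  = function(x, y, z) rev(seq_len(ncol(x)))[seq_len(z)]
)

# Add one selector; checks it is a function taking exactly three arguments.
registerSelector <- function(registry, name, fun) {
    stopifnot(is.function(fun), length(formals(fun)) == 3)
    registry[[name]] <- fun
    registry
}

# Explicit check: [[ with an unknown name on a list returns NULL, not an error.
lookupSelector <- function(registry, name) {
    if (!name %in% names(registry)) stop("unknown gene selector: ", name)
    registry[[name]]
}

# A user-supplied selector is appended when it shows up.
geneSelectors <- registerSelector(geneSelectors, "Middle",
                                  function(x, y, z) ceiling(ncol(x) / 2))

x <- matrix(0, nrow = 2, ncol = 5)
mid <- lookupSelector(geneSelectors, "Middle")(x, NULL, 1L)
```

Whether this is less work than the naming convention is a matter of
taste; the point is only that the registry can grow at run time.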
> > > >
> > > >
> > > >> With your function, what if the user runs:
> > > >>
> > > >> > g(5,3)
> > > >>
> > > > >> What should it do?  (you have only shown definitions for f.1 and f.2).  With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now.  If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
> > > >>
> > > >>
> > > >
> > > > I see the general concern, but not how it applies here. If I pass
> > > > argument "Fratio" then either I use geneSelect.Fratio or I get an
> > > > error if "geneSelect.Fratio" does not exist. Similar to what would
> > > > happen if I do
> > > >
> > > > g1(2, 8)
> > > >
> > > > when f.8 is not defined:
> > > >
> > > > Error in eval(expr, envir, enclos) : object "f.8" not found
> > > > So even in more general cases, except for function redefinitions, etc,
> > > > you are not able to call non-existent stuff.
> > > >
> > > >> 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again.  I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
> > > >>
> > > >>
> > > >
> > > > Yes, that is true. Again, it does not apply to the actual case I have
> > > > in mind, but of course, without the detailed info on context I just
> > > > gave, you could not know that.
> > > >
> > > >
> > > >> 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
> > > >>
> > > >
> > > > Oh, sure. But all the functions above live in a single file (actually,
> > > > > a minipackage) except for the optional user function (which is read
> > > > from a file).
> > > >
> > > >
> > > >>
> > > >> Personally I have never regretted trying not to underestimate my own future stupidity.
> > > >>
> > > >
> > > > Neither do I. And actually, that is why I asked: if Thomas Lumley
> > > > said, in the fortune, that I better rethink about it, then I should
> > > > try rethinking about it. But I asked because I failed to see what the
> > > > problem is.
> > > >
> > > >
> > > >> Hope this helps,
> > > >>
> > > >
> > > > It certainly does.
> > > >
> > > >
> > > > Best,
> > > >
> > > > R.
> > > >
> > > >
> > > >> --
> > > >> Gregory (Greg) L. Snow Ph.D.
> > > >> Statistical Data Center
> > > >> Intermountain Healthcare
> > > >> greg.snow at intermountainmail.org
> > > >> (801) 408-8111
> > > >>
> > > >>
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: r-help-bounces at stat.math.ethz.ch
> > > >> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon
> > > >> > Diaz-Uriarte
> > > >> > Sent: Friday, January 05, 2007 11:41 AM
> > > >> > To: Peter Dalgaard
> > > >> > Cc: r-help; rdiaz02 at gmail.com
> > > >> > Subject: Re: [R] eval(parse(text vs. get when accessing a function
> > > >> >
> > > >> > On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
> > > >> > > Ramon Diaz-Uriarte wrote:
> > > >> > > > Dear All,
> > > >> > > >
> > > >> > > > I've read Thomas Lumley's fortune "If the answer is parse() you
> > > > >> > > > should usually rethink the question.". But I am not sure if that
> > > >> > > > also applies (and why) to other situations (Lumley's comment
> > > >> > > > http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
> > > >> > > > was in reply to accessing a list).
> > > >> > > >
> > > >> > > > Suppose I have similarly called functions, except for a
> > > >> > postfix. E.g.
> > > >> > > >
> > > >> > > > f.1 <- function(x) {x + 1}
> > > >> > > > f.2 <- function(x) {x + 2}
> > > >> > > >
> > > >> > > > And sometimes I want to call f.1 and some other times f.2 inside
> > > >> > > > another function. I can either do:
> > > >> > > >
> > > >> > > > g <- function(x, fpost) {
> > > >> > > >     calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
> > > >> > > >     calledf(x)
> > > >> > > >     ## do more stuff
> > > >> > > > }
> > > >> > > >
> > > >> > > >
> > > >> > > > Or:
> > > >> > > >
> > > >> > > > h <- function(x, fpost) {
> > > >> > > >     calledf <- get(paste("f.", fpost, sep = ""))
> > > >> > > >     calledf(x)
> > > >> > > >     ## do more stuff
> > > >> > > > }
> > > >> > > >
> > > >> > > >
> > > >> > > > Two questions:
> > > >> > > > 1) Why is the second better?
> > > >> > > >
> > > >> > > > 2) By changing g or h I could use "do.call" instead; why
> > > >> > would that
> > > >> > > > be better? Because I can handle differences in argument lists?
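To make the second question concrete, a do.call variant of g and h might
look like the sketch below. do.call accepts the name of the function as a
character string, and its second argument is a list, which is what lets
the argument list vary. The names k, k2, and f.scaled are invented for
this example.

```r
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}

# Same job as g and h: build the name, then call it with x.
k <- function(x, fpost) {
    do.call(paste("f.", fpost, sep = ""), list(x))
}

# A function with a different argument list, and a caller that forwards
# whatever arguments it is given.
f.scaled <- function(x, scale = 1) x * scale
k2 <- function(fpost, ...) do.call(paste("f.", fpost, sep = ""), list(...))

r1 <- k(2, 2)                        # calls f.2(2)
r2 <- k2("scaled", 3, scale = 10)    # calls f.scaled(3, scale = 10)
```

When every function takes the same fixed arguments, as f.1 and f.2 do,
this buys little over get(); the list of arguments only starts to matter
once the signatures differ.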
> > > >> >
> > > >> > Dear Peter,
> > > >> >
> > > >> > Thanks for your answer.
> > > >> >
> > > >> > >
> > > >> > > Who says that they are better?  If the question is how to call a
> > > >> > > function specified by half of its name, the answer could well be to
> > > >> > > use parse(), the point is that you should rethink whether that was
> > > >> > > really the right question.
> > > >> > >
> > > >> > > Why not instead, e.g.
> > > >> > >
> > > > >> > > f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
> > > > >> > > h <- function(x, fpost) f[[fpost]](x)
> > > >> > >
> > > >> > > > h(2,"2")
> > > >> > >
> > > >> > > [1] 4
> > > >> > >
> > > >> > > > h(2,"1")
> > > >> > >
> > > >> > > [1] 3
> > > >> > >
> > > >> >
> > > > >> > I see, this is a direct way of dealing with the problem.
> > > >> > However, you first need to build the f list, and you might
> > > >> > not know about that ahead of time. For instance, if I build a
> > > >> > function so that the only thing that you need to do to use my
> > > >> > function g is to call your function "f.something", and then
> > > >> > pass the "something".
> > > >> >
> > > >> > I am still under the impression that, given your answer,
> > > >> > using "eval(parse(text" is not your preferred way.  What are
> > > >> > the possible problems (if there are any, that is). I guess I
> > > >> > am puzzled by "rethink whether that was really the right question".
> > > >> >
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > R.
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > > > Thanks,
> > > >> > > >
> > > >> > > >
> > > >> > > > R.
> > > >> >
> > > >> > --
> > > > >> > Ramón Díaz-Uriarte
> > > > >> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > > > >> > (Spanish National Cancer Center) Melchor Fernández Almagro, 3
> > > >> > 28029 Madrid (Spain)
> > > >> > Fax: +-34-91-224-6972
> > > >> > Phone: +-34-91-224-6900
> > > >> >
> > > >> > http://ligarto.org/rdiaz
> > > >> > PGP KeyID: 0xE89B3462
> > > >> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > ______________________________________________
> > > >> > R-help at stat.math.ethz.ch mailing list
> > > >> > https://stat.ethz.ch/mailman/listinfo/r-help
> > > >> > PLEASE do read the posting guide
> > > >> > http://www.R-project.org/posting-guide.html
> > > >> > and provide commented, minimal, self-contained, reproducible code.
> > > >> >
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > Ramon Diaz-Uriarte
> > > > Statistical Computing Team
> > > > Structural Biology and Biocomputing Programme
> > > > Spanish National Cancer Centre (CNIO)
> > > > http://ligarto.org/rdiaz
> > > >
> > >
> > > --
> > > Martin T. Morgan
> > > Bioconductor / Computational Biology
> > > http://bioconductor.org
> > >
> >
> >
> > --
> > Ramon Diaz-Uriarte
> > Statistical Computing Team
> > Structural Biology and Biocomputing Programme
> > Spanish National Cancer Centre (CNIO)
> > http://ligarto.org/rdiaz
> >
> >
>
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz


