[R] eval(parse(text vs. get when accessing a function
Martin Morgan
mtmorgan at fhcrc.org
Sat Jan 6 17:51:23 CET 2007
Hi Ramon,
It seems like a naming convention (f.xxx) and eval(parse(...)) are
standing in for objects (of class 'GeneSelector', say, representing a
function with a particular form and doing a particular operation) and
dispatch (a function 'geneConverter' might handle a converter of class
'GeneSelector' one way and user-supplied ad-hoc functions more carefully;
inside geneConverter the only real concern is that the converter
argument is in fact a callable function).
eval(parse(...)) brings scoping rules to the fore as an explicit
programming concern; with classes and dispatch scope is implicit, but
that's probably better -- R will get its own rules right.
Martin
Here's an S4 sketch:
setClass("GeneSelector",
         contains="function",
         representation=representation(description="character"),
         validity=function(object) {
             msg <- NULL
             argNames <- names(formals(object))
             if (argNames[1]!="x")
                 msg <- c(msg, "\n GeneSelector requires a first argument named 'x'")
             if (!"..." %in% argNames)
                 msg <- c(msg, "\n GeneSelector requires '...' in its signature")
             if (0==length(object@description))
                 msg <- c(msg, "\n Please describe your GeneSelector")
             if (is.null(msg)) TRUE else msg
         })
setGeneric("geneConverter",
           function(converter, x, ...) standardGeneric("geneConverter"),
           signature=c("converter"))

setMethod("geneConverter",
          signature(converter="GeneSelector"),
          function(converter, x, ...) {
              ## important stuff here
              converter(x, ...)
          })

setMethod("geneConverter",
          signature(converter="function"),
          function(converter, x, ...) {
              message("ad-hoc converter; hope it works!")
              converter(x, ...)
          })
and then...
> c1 <- new("GeneSelector",
+ function(x, ...) prod(x, ...),
+ description="Product of x")
>
> c2 <- new("GeneSelector",
+ function(x, ...) sum(x, ...),
+ description="Sum of x")
>
> geneConverter(c1, 1:4)
[1] 24
> geneConverter(c2, 1:4)
[1] 10
> geneConverter(mean, 1:4)
ad-hoc converter; hope it works!
[1] 2.5
>
> cvterr <- new("GeneSelector", function(y) {})
Error in validObject(.Object) : invalid class "GeneSelector" object: 1:
GeneSelector requires a first argument named 'x'
invalid class "GeneSelector" object: 2:
GeneSelector requires '...' in its signature
invalid class "GeneSelector" object: 3:
Please describe your GeneSelector
> xxx <- 10
> geneConverter(xxx, 1:4)
Error in function (classes, fdef, mtable) :
unable to find an inherited method for function "geneConverter", for signature "numeric"
"Ramon Diaz-Uriarte" <rdiaz02 at gmail.com> writes:
> Dear Greg,
>
>
> On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
>> Ramon,
>>
>> I prefer to use the list method for this type of thing; here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
>>
>
>
> Those suggestions do apply to me of course (no claim to being
> organized nor beyond idiocy here). And actually the suggestions on
> this thread are being very useful. I think, though, that I was not
> very clear on the context and my examples were too dumbed down. So
> I'll try to give more detail (nothing here is secret, I am just trying
> not to bore people).
>
> The code is part of a web-based application, so there is no
> interactive user. The R code is passed the arguments (and optional
> user functions) from the CGI.
>
> There is one "core" function (call it cvFunct) that, among other
> things, does cross-validation. So this is one way to do things:
>
> cvFunct <- function(whatever, genefiltertype, whateverelse) {
> internalGeneSelect <- eval(parse(text = paste("geneSelect",
> genefiltertype, sep = ".")))
>
> ## do things calling internalGeneSelect,
> }
>
> and now define all possible functions as
>
> geneSelect.Fratio <- function(x, y, z) { ## something
> }
> geneSelect.Wilcoxon <- function(x, y, z) { ## something else
> }
>
> If I want more geneSelect functions, adding them is simple. And I can
> even allow the user to pass her/his own functions, with the only
> restriction that it takes three args, x, y, z, and that the function
> is to be called "geneSelect." plus a user-chosen string. (Yes, I need
> to make sure no calls to "system", etc, are in the user code, etc,
> etc, but that is another issue).
>
> The general idea is not new of course. For instance, in package
> "e1071", a somewhat similar thing is done in function "tune", and
> David Meyer there uses "do.call". However, tune is a lot more general
> than what I had in mind. For instance, "tune" deals with arbitrary
> functions, with arbitrary numbers and names of parameters, whereas my
> functions above all take only three arguments (x: a matrix, y: a
> vector; z: an integer), so the neat functionality provided by
> "do.call", and passing the args as a list is not really needed.
>
> So, given that my situation is so structured, and I do not need
> "do.call", I think the approach via eval(parse(paste makes my life
> simple:
>
> a) the central function (cvFunct) uses something I can easily
> recognize: "internalGeneSelect"
>
> b) after the initial eval(parse(text I do not need to worry anymore
> about what the "true" gene selection function is called
>
> c) adding new functions and calling them is simple: function naming
> follows a simple pattern ("geneSelect." + postfix) and calling the
> user function only requires passing the postfix to cvFunct.
>
> d) notice also that the functions I define, at least, will of course not
> be named "f.1", etc, but rather things like "geneSelect.Fratio" or
> "geneSelect.namesThatStartWithCuteLetters";
>
> I hope this makes things more clear. I did not include this detail
> because this is probably boring (I guess most of you have stopped
> reading by now :-).
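
A minimal sketch of the same cvFunct lookup written with get() instead of eval(parse(text = ...)). The selector bodies and the argument order of cvFunct are placeholders assumed for illustration, not the real code; mode = "function" is a standard argument of get() that restricts the lookup to functions.

```r
## Hypothetical stand-ins for the real selectors; the bodies are placeholders.
geneSelect.Fratio   <- function(x, y, z) "Fratio"
geneSelect.Wilcoxon <- function(x, y, z) "Wilcoxon"

cvFunct <- function(x, y, z, genefiltertype) {
    ## Build the full name from the fixed prefix and the supplied postfix.
    fname <- paste("geneSelect", genefiltertype, sep = ".")
    ## mode = "function" makes get() skip non-function objects of that name.
    internalGeneSelect <- get(fname, mode = "function")
    ## From here on only the local name is used.
    internalGeneSelect(x, y, z)
}

cvFunct(matrix(1:4, 2), c(0, 1), 1L, "Wilcoxon")
```

The calling convention is unchanged: only the postfix is passed in, and an unknown postfix still fails with an error rather than calling something else.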
>
>
>> Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea. Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them.
>>
>> With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
>>
>
> But I don't see how having your functions as list elements is easier
> (especially if the function is longer than 2 to 3 lines) than having
> all functions systematically named things such as:
>
> geneSelect.Fratio
> geneSelect.Random
> geneSelect.LetterA
> etc
>
> Of course, I could have a list with the components named "Fratio"
> "Random", "LetterA". But I fail to see what it adds. And it forces me
> to build the list, and probably rebuild it when (or not build it until)
> the user enters her/his own selection function. But the latter I do not
> need to do with the scheme above.
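
For comparison, a sketch of the list variant under the same kind of assumptions (placeholder bodies; the names geneSelectors, userFun and selectGenes are hypothetical). Adding a user-supplied function is a single assignment into the list, so nothing has to be rebuilt.

```r
## Built-in selectors, keyed by the string the caller will pass.
geneSelectors <- list(
    Fratio = function(x, y, z) "Fratio",
    Random = function(x, y, z) "Random"
)

## A user-supplied function arriving later is registered in one step.
userFun <- function(x, y, z) "LetterA"
geneSelectors[["LetterA"]] <- userFun

selectGenes <- function(type, x, y, z) {
    f <- geneSelectors[[type]]
    ## For a list, [[ with an unknown name returns NULL, so check explicitly.
    if (!is.function(f))
        stop("unknown gene selector: ", type)
    f(x, y, z)
}

selectGenes("LetterA", matrix(1:4, 2), c(0, 1), 1L)
```

Whether this is easier than systematically named functions is a matter of taste; the practical difference is that only functions placed in the list can ever be reached through the string.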
>
>
>> With your function, what if the user runs:
>>
>> > g(5,3)
>>
>> What should it do? (you have only shown definitions for f.1 and f.2). With my luck I would accidentally type that and just happen to have a f.3 function sitting around from a previous project that does something that I really don't want it to do now. If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
>>
>>
>
> I see the general concern, but not how it applies here. If I pass
> argument "Fratio" then either I use geneSelect.Fratio or I get an
> error if "geneSelect.Fratio" does not exist. Similar to what would
> happen if I do
>
> g1(2, 8)
>
> when f.8 is not defined:
>
> Error in eval(expr, envir, enclos) : object "f.8" not found
> So even in more general cases, except for function redefinitions, etc,
> you are not able to call non-existent stuff.
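
One difference that matters when the postfix comes from outside, as in a web application: eval(parse(text = ...)) treats the whole string as R code, whereas get() treats it as a name. A small sketch using the f.1 naming from this thread; the variables flag and crafted are made up for the demonstration.

```r
f.1 <- function(x) {x + 1}
flag <- FALSE

g <- function(x, fpost) {
    calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
    calledf(x)
}

h <- function(x, fpost) {
    calledf <- get(paste("f.", fpost, sep = ""), mode = "function")
    calledf(x)
}

## A crafted postfix: the pasted text parses as three expressions,
## and eval() runs all of them, returning the value of the last.
crafted <- "1; flag <<- TRUE; f.1"

g_result <- g(2, crafted)                      # also runs the embedded assignment
h_result <- try(h(2, crafted), silent = TRUE)  # fails: no object has that name
```

After this runs, flag has been set by the string passed to g, while h stopped with an error before anything was evaluated.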
>
>> 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again. I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
>>
>>
>
> Yes, that is true. Again, it does not apply to the actual case I have
> in mind, but of course, without the detailed info on context I just
> gave, you could not know that.
>
>
>> 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
>>
>
> Oh, sure. But all the functions above live in a single file (actually,
> a minipackage) except for the optional user function (which is read
> from a file).
>
>
>>
>> Personally I have never regretted trying not to underestimate my own future stupidity.
>>
>
> Neither have I. And actually, that is why I asked: if Thomas Lumley
> said, in the fortune, that I had better rethink it, then I should
> try rethinking it. But I asked because I failed to see what the
> problem is.
>
>
>> Hope this helps,
>>
>
> It certainly does.
>
>
> Best,
>
> R.
>
>
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.snow at intermountainmail.org
>> (801) 408-8111
>>
>>
>>
>> > -----Original Message-----
>> > From: r-help-bounces at stat.math.ethz.ch
>> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon
>> > Diaz-Uriarte
>> > Sent: Friday, January 05, 2007 11:41 AM
>> > To: Peter Dalgaard
>> > Cc: r-help; rdiaz02 at gmail.com
>> > Subject: Re: [R] eval(parse(text vs. get when accessing a function
>> >
>> > On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
>> > > Ramon Diaz-Uriarte wrote:
>> > > > Dear All,
>> > > >
>> > > > I've read Thomas Lumley's fortune "If the answer is parse() you
>> > > > should usually rethink the question.". But I am not sure if that
>> > > > also applies (and why) to other situations (Lumley's comment
>> > > > http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
>> > > > was in reply to accessing a list).
>> > > >
>> > > > Suppose I have similarly called functions, except for a
>> > postfix. E.g.
>> > > >
>> > > > f.1 <- function(x) {x + 1}
>> > > > f.2 <- function(x) {x + 2}
>> > > >
>> > > > And sometimes I want to call f.1 and some other times f.2 inside
>> > > > another function. I can either do:
>> > > >
>> > > > g <- function(x, fpost) {
>> > > > calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
>> > > > calledf(x)
>> > > > ## do more stuff
>> > > > }
>> > > >
>> > > >
>> > > > Or:
>> > > >
>> > > > h <- function(x, fpost) {
>> > > > calledf <- get(paste("f.", fpost, sep = ""))
>> > > > calledf(x)
>> > > > ## do more stuff
>> > > > }
>> > > >
>> > > >
>> > > > Two questions:
>> > > > 1) Why is the second better?
>> > > >
>> > > > 2) By changing g or h I could use "do.call" instead; why
>> > would that
>> > > > be better? Because I can handle differences in argument lists?
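
On the second question, a sketch of what the do.call variant could look like. do.call accepts the function name as a character string and the arguments as a list, so no separate lookup step is needed; the name k and the extra '...' argument are illustrative only.

```r
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}

k <- function(x, fpost, ...) {
    ## The first argument of do.call may be a function or a character
    ## string naming one; the second is the list of arguments to pass.
    do.call(paste("f.", fpost, sep = ""), list(x, ...))
}

k(2, "1")   # 3
k(2, "2")   # 4
```

Passing the arguments as a list is what helps when the called functions differ in their argument lists; with a fixed signature it buys little beyond avoiding parse().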
>> >
>> > Dear Peter,
>> >
>> > Thanks for your answer.
>> >
>> > >
>> > > Who says that they are better? If the question is how to call a
>> > > function specified by half of its name, the answer could well be to
>> > > use parse(), the point is that you should rethink whether that was
>> > > really the right question.
>> > >
>> > > Why not instead, e.g.
>> > >
>> > > f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
>> > > h <- function(x, fpost) f[[fpost]](x)
>> > >
>> > > > h(2,"2")
>> > >
>> > > [1] 4
>> > >
>> > > > h(2,"1")
>> > >
>> > > [1] 3
>> > >
>> >
>> > I see, this is a direct way of dealing with the problem.
>> > However, you first need to build the f list, and you might
>> > not know about that ahead of time. For instance, if I build a
>> > function so that the only thing that you need to do to use my
>> > function g is to call your function "f.something", and then
>> > pass the "something".
>> >
>> > I am still under the impression that, given your answer,
>> > using "eval(parse(text" is not your preferred way. What are
>> > the possible problems (if there are any, that is)? I guess I
>> > am puzzled by "rethink whether that was really the right question".
>> >
>> >
>> > Thanks,
>> >
>> > R.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > > > Thanks,
>> > > >
>> > > >
>> > > > R.
>> >
>> > --
>> > Ramón Díaz-Uriarte
>> > Centro Nacional de Investigaciones Oncológicas (CNIO)
>> > (Spanish National Cancer Center) Melchor Fernández Almagro, 3
>> > 28029 Madrid (Spain)
>> > Fax: +-34-91-224-6972
>> > Phone: +-34-91-224-6900
>> >
>> > http://ligarto.org/rdiaz
>> > PGP KeyID: 0xE89B3462
>> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
>> >
>> >
>> >
>> >
>> > ______________________________________________
>> > R-help at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>
>
> --
> Ramon Diaz-Uriarte
> Statistical Computing Team
> Structural Biology and Biocomputing Programme
> Spanish National Cancer Centre (CNIO)
> http://ligarto.org/rdiaz
>
--
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org