R-beta: Re: S Compatibility

Luke Tierney luke at stat.umn.edu
Wed Apr 30 17:02:05 CEST 1997


Bill Venables writes:
> 
> Ross Ihaka writes:
>  > Bill Venables writes:
>  > 
>  > > Are the scoping differences between R and S set out precisely and
>  > > definitively somewhere?  This would be useful.
>  > 
>  > In the source code perhaps? :-)
> ...

> Of course this is also possbile in S, but much less elegantly.
> 
> > make.linear.fun <- function(a, b) 
> +	substitute(function(x) .a +.b * x, list(.a = a, .b = b))
> > make.linear.fun(2,3)
> function(x)
> 2 + 3 * x
> > make.linear.fun(2,3) -> funny
> > funny(1)
> [1] 5

substitute works nicely here, but in general it isn't the best choice
since it doesn't understand syntax. I use a little function defined
like this (a good S programmer could probably do this much better):

MC <- function(f, env = NULL)
{
        env <- as.list(env)
        if(mode(f) != "function")
                stop(paste("not a function:", f))
        if(length(env) > 0 && any(names(env) == ""))
                stop(paste("all arguments are not named:", env))
        fargs <- if(length(f) > 1) f[1:(length(f) - 1)] else NULL
        fbody <- f[length(f)]
        cf <- c(fargs, env, fbody)
        mode(cf) <- "function"
        return(cf)
}

MC stands for 'make closure'. This function just adds an environment
list as extra arguments with default values to the function argument
list. Using MC, make.liner.fun is

> make.linear.fun <- function(a, b) MC(function(x) a + b * x,list(a=a,b=b))
> make.linear.fun(2,3)
function(x, a = 2, b = 3)
a + b * x

Adding to the argument list in this way insures that ordinary syntax
determines when these bindings are in effect within the body, so
assignments in the function body, local function definitions, etc work
properly -- they wouldn't if you use substitute and variable names are
reused. This MC function is analogous to functions used to make closures
in older dynamically scoped Lisp dialects like Franz Lisp.  The main
things R closures provide that this does not are encapsulation of
bindings and mutable shared bindings.

At the moment Splus provides very few higher-level functions
(i.e. functions that use function arguments -- there isn't much more
than sweep and apply that I know of -- and users tend not to write
their own, partly because closures are not explicitly available and
therefore aren't something most users think of. Without closures you
have to either design your higher-level function to pass along extra
data a function argument might need, or you have to expect users to go
through some sort of performance of squirreling the data away in a
frame (basically roll-your-own dynamic scoping). Both are error prone
and discourage the use of higher-level functions even though the
language is quite capable of supporting them, at least to a limited
extent. With closures, even the simple ones that you can build with
something like MC, writing functions like apply or any other
higher-level function is much easier.

If you want portable code but want to use closures, as long as you are
not depending on shared mutable bindings you could use the definition
of MC I gave above, or some variant, in S and define MC in R as

	MC<-function(f,env) f

-- you would have to make sure you don't rename variables in the env
argument.

As a historical note, the way apply works goes way back to S Version
1, the one corresponding to the 1984 book. The functions you could use
had to be the kind you would write in C or Fortran (I don't believe
macros could be used) so there was no option of adjusting the function
at run time to contain appropriate data. So the backward compatibility
that is being maintained goes back quite a long way.

As another aside, one of the changes in Java 1.1 that makes it a much
more powerful language is the addition of "inner classes" for creating
local classes and instances in a function at runtime. This is in many
ways comparable to providing closures.

I think one could even make an argument that using closures a bit more
than they are used now in R could eliminate most if not all the need
for using substitute/match/sys.blank in modeling functions to get at
the appropriate data for a formula. This might even allow these
functions and the explicit commitment to making all the details of
frames available at all times to be dropped, at least from code that
is considered debugged. This would lead to incompatibility with S but
perhaps would result in a somewhat smaller and neater core language
design, but that is another story.

luke
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=



More information about the R-help mailing list