[Rd] question about ... passed to two different functions

Sun Sep 6 23:34:11 CEST 2009

On 06/09/2009 1:50 PM, Charles Geyer wrote:
> I have hit a problem with the design of the mcmc package I can't
> figure out, possibly because I don't really understand the R function
> call mechanism.  The function metrop in the mcmc package has a ... argument
> that it passes to one or two user-supplied functions, which are other
> arguments to metrop.  When the two functions don't have the same arguments,
> this doesn't work.  Here's an example.
> 
>  library(mcmc)
>  library(MASS)
> 
>  set.seed(42)
> 
>  n <- 100
>  rho <- 0.5
>  beta0 <- 0.25
>  beta1 <- 0.5
>  beta2 <- 1
>  beta3 <- 1.5
> 
>  Sigma <- matrix(rho, 3, 3)
>  diag(Sigma) <- 1
>  Sigma <- 0.75 * Sigma
>  Mu <- rep(0, 3)
> 
>  foo <- mvrnorm(n, Mu, Sigma)
> 
>  x1 <- foo[ , 1]
>  x2 <- foo[ , 2]
>  x3 <- foo[ , 3]
> 
>  modmat <- cbind(1, foo)
> 
>  eta <- beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3
>  p <- 1 / (1 + exp(- eta))
>  y <- as.numeric(runif(n) < p)
> 
>  out <- glm(y ~ x1 + x2 + x3, family = binomial())
>  summary(out)
> 
>  ### now we want to do a Bayesian analysis of the model, so we write
>  ### a function that evaluates the log unnormalized density of the
>  ### Markov chain we want to run (log likelihood + log prior)
> 
>  ludfun <- function(beta) {
>      stopifnot(is.numeric(beta))
>      stopifnot(length(beta) == ncol(modmat))
>      eta <- as.numeric(modmat %*% beta)
>      logp <- ifelse(eta < 0, eta - log1p(exp(eta)), - log1p(exp(- eta)))
>      logq <- ifelse(eta < 0, - log1p(exp(eta)), - eta - log1p(exp(- eta)))
>      logl <- sum(logp[y == 1]) + sum(logq[y == 0])
>      val <- logl - sum(beta^2) / 2
>      return(val)
>  }
> 
>  beta.initial <- as.vector(out$coefficients)
> 
>  out <- metrop(ludfun, initial = beta.initial, nbatch = 20,
>      blen = 10, nspac = 5, scale = 0.56789)
> 
>  ### Works fine.  Here are the Monte Carlo estimates of the posterior
>  ### means for each parameter with Monte Carlo standard errors.
> 
>  apply(out$batch, 2, mean)
>  sqrt(apply(out$batch, 2, function(x) initseq(x)$var.con) / out$nbatch)
> 
>  ### Now suppose I want Monte Carlo estimates of some function of
>  ### the parameters other than the identity function.  I write a function
>  ### outfun that does that.  Also suppose I want some extra arguments
>  ### to outfun.  This example is a bit forced, but I hit on natural
>  ### examples with a new function (not yet released) that works like
>  ### metrop but does simulated tempering.
> 
>  outfun <- function(beta, degree) {
>      stopifnot(is.numeric(beta))
>      stopifnot(length(beta) == ncol(modmat))
>      stopifnot(is.numeric(degree))
>      stopifnot(length(degree) == 1)
>      stopifnot(degree == as.integer(degree))
>      stopifnot(length(degree) > 0)
>      result <- NULL
>      for (i in 1:degree)
>          result <- c(result, beta^i)
>      return(result)
>  }
> 
>  out <- metrop(out, outfun = outfun, degree = 2)
> 
>  ### Oops!  Try it and you get
>  ###
>  ### Error in obj(state, ...) : unused argument(s) (degree = 2)
> 
> I don't understand what the problem is (mostly because of ignorance).  Because
> 
>  foo <- function(x, ...) x
>  foo(x = 2, y = 3)
> 
> does work.  The error is happening when ludfun is called, and I assume
> the complaint is that it doesn't have an argument "degree", but then
> why doesn't the simple example just above fail?  So clearly I don't
> understand what's going on.

The difference between foo and ludfun is that foo has a ... argument, 
and ludfun doesn't.  So when ludfun gets called with argument degree=2, 
it doesn't know what to do with it.  foo just puts it into the ... and 
ignores it.

Generally ... is very handy when there's only one function that may have 
optional arguments, but less so when you have two or more sets of 
optional arguments to handle.  You can use list(...) to get the list of 
args and spend a lot of work splitting them up, or you could tell users 
that if they want to pass optional arguments to outfun then ludfun 
should also be prepared to receive them, or you could add a parameter 
"ludcontrol" to be a list of things to pass to ludfun, and "outcontrol" 
to be a list of things to pass to outfun.  Or do away with optional 
parameters completely.

> 
> An obvious solution is to ignore ... and just use global variables, i. e.,
> define degree <- 2 in the global environment and make the signature of
> outfun function(beta).  That does work.  But I don't want to have to
> explain this issue on the help pages if I can actually fix the problem.

That's one solution.  A somewhat better one is to keep the values local, 
but still keep the signature of the function the same.  For example,

outfun <- local({  degree <- 2
                    function(beta) {
                       ...
                    }
                 })

or a function that creates outfun, e.g.

makeOutfun <- function(degree) {
   force(degree)  # make sure it gets evaluated
   function(beta) {  # Now create a function that can see it.
      ...
   }
}

These two solutions are better than a global in that you can have 
multiple outfuns with different degree values.

> 
> I have no idea whether one needs to look at the source code for the
> mcmc package to diagnose the issue.  If one does, it's on CRAN.
> 

I don't think so, but if my advice is way off base, maybe so.

Duncan Murdoch