[Rd] [R] unvectorized option for outer()

Jonathan Rougier J.C.Rougier at durham.ac.uk
Sun Oct 30 11:48:28 CET 2005


I'm not sure about this.  Perhaps I am a dinosaur, but my feeling is 
that if people are writing functions in R that might be subject to 
simple operations like outer products, then they ought to be writing 
vectorised functions!  Maybe it's not possible to hold this line, and 
maybe "outer" is not the right place to draw it, but I think we ought to 
respect the "x is a vector" mindset as much as possible in the base 
package.  As Tony notes, the documentation does try to be clear about 
what outer actually does, and how it can be used.

So I am not a fan of the VECTORIZED argument, and definitely not a fan 
of the VECTORIZED=FALSE default.

Jonathan.

Gabor Grothendieck wrote:
> If the default were changed to VECTORIZED=FALSE then it would
> still be functionally compatible with what we have now so all existing
> software would continue to run correctly yet would not cause
> problems for the unwary.  Existing software would not have to be changed
> to add VECTORIZED=TRUE except for those, presumbly few, cases
> where outer performance is critical.   One optimization might be to
> have the default be TRUE if the function is * or perhaps if its specified
> as a single character and FALSE otherwise.
> 
> Having used APL I always regarded the original design of outer in R as
> permature performance optimization and this would be a chance to get
> it right.
> 
> On 10/28/05, Tony Plate <tplate at acm.org> wrote:
> 
>>[following on from a thread on R-help, but my post here seems more
>>appropriate to R-devel]
>>
>>Would a patch to make outer() work with non-vectorized functions be
>>considered?  It seems to come up moderately often on the list, which
>>probably indicates that many many people get bitten by the same
>>incorrect expectation, despite the documentation and the FAQ entry.  It
>>looks pretty simple to modify outer() appropriately: one extra function
>>argument and an if-then-else clause to call mapply(FUN, ...) instead of
>>calling FUN directly.
>>
>>Here's a function demonstrating this:
>>
>>outer2 <- function (X, Y, FUN = "*", ..., VECTORIZED=TRUE)
>>{
>>    no.nx <- is.null(nx <- dimnames(X <- as.array(X)))
>>    dX <- dim(X)
>>    no.ny <- is.null(ny <- dimnames(Y <- as.array(Y)))
>>    dY <- dim(Y)
>>    if (is.character(FUN) && FUN == "*") {
>>        robj <- as.vector(X) %*% t(as.vector(Y))
>>        dim(robj) <- c(dX, dY)
>>    }
>>    else {
>>        FUN <- match.fun(FUN)
>>        Y <- rep(Y, rep.int(length(X), length(Y)))
>>        if (length(X) > 0)
>>            X <- rep(X, times = ceiling(length(Y)/length(X)))
>>        if (VECTORIZED)
>>            robj <- FUN(X, Y, ...)
>>        else
>>            robj <- mapply(FUN, X, Y, MoreArgs=list(...))
>>        dim(robj) <- c(dX, dY)
>>    }
>>    if (no.nx)
>>        nx <- vector("list", length(dX))
>>    else if (no.ny)
>>        ny <- vector("list", length(dY))
>>    if (!(no.nx && no.ny))
>>        dimnames(robj) <- c(nx, ny)
>>    robj
>>}
>># Some examples
>>f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
>>outer2(1:2, 3:5, f, 2)
>>outer2(numeric(0), 3:5, f, 2)
>>outer2(1:2, numeric(0), f, 2)
>>outer2(1:2, 3:5, f, 2, VECTORIZED=F)
>>outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
>>outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
>>
>># Output on examples
>>
>>>f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
>>>outer2(1:2, 3:5, f, 2)
>>
>>in f
>>     [,1] [,2] [,3]
>>[1,]    9   16   25
>>[2,]   36   64  100
>>
>>>outer2(numeric(0), 3:5, f, 2)
>>
>>in f
>>     [,1] [,2] [,3]
>>
>>>outer2(1:2, numeric(0), f, 2)
>>
>>in f
>>
>>[1,]
>>[2,]
>>
>>>outer2(1:2, 3:5, f, 2, VECTORIZED=F)
>>
>>in f
>>in f
>>in f
>>in f
>>in f
>>in f
>>     [,1] [,2] [,3]
>>[1,]    9   16   25
>>[2,]   36   64  100
>>
>>>outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
>>
>>     [,1] [,2] [,3]
>>
>>>outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
>>
>>[1,]
>>[2,]
>>
>>If a patch to add this feature would be considered, I'd be happy to
>>submit one (including documentation).  If so, and if there are any
>>potential traps I should bear in mind, please let me know!
>>
>>-- Tony Plate
>>
>>Rau, Roland wrote:
>>
>>>Dear all,
>>>
>>>a big thanks to Thomas Lumley, James Holtman and Tony Plate for their
>>>answers. They all pointed in the same direction => I need a vectorized
>>>function to be applied. Hence, I will try to work with a 'wrapper'
>>>function as described in the FAQ.
>>>
>>>Thanks again,
>>>Roland
>>>
>>>
>>>
>>>
>>>>-----Original Message-----
>>>>From: Thomas Lumley [mailto:tlumley at u.washington.edu]
>>>>Sent: Thursday, October 27, 2005 11:39 PM
>>>>To: Rau, Roland
>>>>Cc: r-help at stat.math.ethz.ch
>>>>Subject: Re: [R] outer-question
>>>>
>>>>
>>>>You want FAQ 7.17 Why does outer() behave strangely with my function?
>>>>
>>>>     -thomas
>>>>
>>>>On Thu, 27 Oct 2005, Rau, Roland wrote:
>>>>
>>>>
>>>>
>>>>>Dear all,
>>>>>
>>>>>This is a rather lengthy message, but I don't know what I
>>>>
>>>>made wrong in
>>>>
>>>>
>>>>>my real example since the simple code works.
>>>>>I have two variables a, b and a function f for which I would like to
>>>>>calculate all possible combinations of the values of a and b.
>>>>>If f is multiplication, I would simply do:
>>>>>
>>>>>a <- 1:5
>>>>>b <- 1:5
>>>>>outer(a,b)
>>>>>
>>>>>## A bit more complicated is this:
>>>>>f <- function(a,b,d) {
>>>>>    return(a*b+(sum(d)))
>>>>>}
>>>>>additional <- runif(100)
>>>>>outer(X=a, Y=b, FUN=f, d=additional)
>>>>>
>>>>>## So far so good. But now my real example. I would like to plot the
>>>>>## log-likelihood surface for two parameters alpha and beta of
>>>>>## a Gompertz distribution with given data
>>>>>
>>>>>### I have a function to generate random-numbers from a
>>>>>Gompertz-Distribution
>>>>>### (using the 'inversion method')
>>>>>
>>>>>random.gomp <- function(n, alpha, beta) {
>>>>>          return( (log(1-(beta/alpha*log(1-runif(n)))))/beta)
>>>>>}
>>>>>
>>>>>## Now I generate some 'lifetimes'
>>>>>no.people <- 1000
>>>>>al <- 0.1
>>>>>bet <- 0.1
>>>>>lifetimes <- random.gomp(n=no.people, alpha=al, beta=bet)
>>>>>
>>>>>### Since I neither have censoring nor truncation in this
>>>>
>>>>simple case,
>>>>
>>>>
>>>>>### the log-likelihood should be simply the sum of the log of the
>>>>>### the densities (following the parametrization of
>>>>
>>>>Klein/Moeschberger
>>>>
>>>>
>>>>>### Survival Analysis, p. 38)
>>>>>
>>>>>loggomp <- function(alphas, betas, timep) {
>>>>>return(sum(log(alphas) + betas*timep + (alphas/betas *
>>>>>(1-exp(betas*timep)))))
>>>>>}
>>>>>
>>>>>### Now I thought I could obtain a matrix of the
>>>>
>>>>log-likelihood surface
>>>>
>>>>
>>>>>### by specifying possible values for alpha and beta with the given
>>>>>data.
>>>>>### I was able to produce this matrix with two for-loops.
>>>>
>>>>But I thought
>>>>
>>>>
>>>>>### I could use also 'outer' in this case.
>>>>>### This is what I tried:
>>>>>
>>>>>possible.alphas <- seq(from=0.05, to=0.15, length=30)
>>>>>possible.betas <- seq(from=0.05, to=0.15, length=30)
>>>>>
>>>>>outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
>>>>
>>>>timep=lifetimes)
>>>>
>>>>
>>>>>### But the result is:
>>>>>
>>>>>
>>>>>>outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
>>>>>
>>>>>timep=lifetimes)
>>>>>Error in outer(X = possible.alphas, Y = possible.betas, FUN
>>>>
>>>>= loggomp,
>>>>
>>>>
>>>>>:
>>>>>      dim<- : dims [product 900] do not match the length
>>>>
>>>>of object [1]
>>>>
>>>>
>>>>>In addition: Warning messages:
>>>>>...
>>>>>
>>>>>### Can somebody give me some hint where the problem is?
>>>>>### I checked my definition of 'loggomp' but I thought this
>>>>
>>>>looks fine:
>>>>
>>>>
>>>>>loggomp(alphas=possible.alphas[1], betas=possible.betas[1],
>>>>>timep=lifetimes)
>>>>>loggomp(alphas=possible.alphas[4], betas=possible.betas[10],
>>>>>timep=lifetimes)
>>>>>loggomp(alphas=possible.alphas[3], betas=possible.betas[11],
>>>>>timep=lifetimes)
>>>>>
>>>>>
>>>>>### I'd appreciate any kind of advice.
>>>>>### Thanks a lot in advance.
>>>>>### Roland
>>>>>
>>>>>
>>>>>+++++
>>>>>This mail has been sent through the MPI for Demographic
>>>>
>>>>Rese...{{dropped}}
>>>>
>>>>
>>>>>______________________________________________
>>>>>R-help at stat.math.ethz.ch mailing list
>>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>PLEASE do read the posting guide!
>>>>
>>>>http://www.R-project.org/posting-guide.html
>>>>
>>>>Thomas Lumley                 Assoc. Professor, Biostatistics
>>>>tlumley at u.washington.edu      University of Washington, Seattle
>>>>
>>>
>>>
>>>+++++
>>>This mail has been sent through the MPI for Demographic Research.  Should you receive a mail that is apparently from a MPI user without this text displayed, then the address has most likely been faked. If you are uncertain about the validity of this message, please check the mail header or ask your system administrator for assistance.
>>>
>>>
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>
> 
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Jonathan Rougier                       Science Laboratories
Department of Mathematical Sciences    South Road
University of Durham                   Durham DH1 3LE
tel: +44 (0)191 334 3111, fax: +44 (0)191 334 3051
http://www.maths.dur.ac.uk/stats/people/jcr/jcr.html



More information about the R-devel mailing list