[R] SUMMARY: "elementary sapply question"

Ajay Shah ajayshah at mayin.org
Tue Jun 22 12:38:00 CEST 2004


I am grateful to Andy Liaw, Douglas Grove, Brian Ripley, Tony Plate,
Dirk Eddelbuettel and Sundar Dorai-Raj all of whom got together and
drilled sense into my skull. I would like to put some effort into
explaining the question I was grappling with, and the (nice) R way of
solving it.

My apologies: I am still a victim of too many years of writing C, so
I'm a bit dense and it takes me a while to comprehend. :-) My code
fragments are not intended to be 'serious code', they are just written
for maximum readability (at least to my C-damaged brain). Please
forgive me if I'm not yet quite doing idiomatic R. I'm trying to learn
the lingo!

My question:
  When I have a f(x, y) and I do

  sapply(list, f)

  I know that sapply will run over list elems and run f() many
  times. How do I make him iterate over x values as opposed to
  iterating over y values? In:

     sapply(x, f, 3)

  how does sapply know that I mean:

    for (i in 3:5) {
        f(i, 3)
    }

  and not

    for (i in 3:5) {
        f(3, i)
    }

  How would we force sapply to use one or the other interpretation?


Here's what I learned.

Rule: sapply() uses your list to make 'the first arg' to the function.

  When you say
    > sapply(3:5, f)
  he's going to do f(3), f(4) and f(5).

Rule: sapply() allows you to supply extra args which will be passed
  into the function.

  When you say 
    > sapply(3:5, f, z=2)
  he's going to do f(3, z=2), f(4, z=2), f(5, z=2)
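
  Here is a small sketch of these two rules. The function f below is
  made up purely for illustration (it is not from the thread); the loop
  at the end is just the C-style way of saying the same thing.

```r
# Hypothetical function, for illustration only
f <- function(x, z = 0) x * 10 + z

# Rule 1: each element of 3:5 becomes the first arg
r1 <- sapply(3:5, f)          # f(3), f(4), f(5)

# Rule 2: extra args are passed along unchanged on every call
r2 <- sapply(3:5, f, z = 2)   # f(3, z=2), f(4, z=2), f(5, z=2)

# The same calls written as an explicit loop
r3 <- numeric(0)
for (i in 3:5) r3 <- c(r3, f(i, z = 2))
```

  Here r2 and r3 hold the same values; sapply just saves you the loop
  and the bookkeeping.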

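Fact: R matches function args by fixed rules, not just by the order
you typed them: named args are matched first, and the leftover unnamed
ones fill the remaining slots in order. Watch: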

  > myfunction <- function(x, y) x*x + y
  > myfunction(10,3)
  [1] 103
        In this case, he placed 10 as x and 3 as y because that's the
        order that they came in.
  > myfunction(y=3, x=10)
  [1] 103
        This works, even though they're in the wrong order, because I
        explicitly said that the 1st is y and the 2nd is x.
  > myfunction(y=3, 10)
  [1] 103
        This is interesting! I only disambiguated y. So he jumped to
        the conclusion that the lonely one was x.

With this in hand, think of how sapply would behave. If you say

  > sapply(3:5, f, 5)

He's going to do f(3, 5), f(4, 5), f(5, 5). In this case, R will infer
that you must mean x=3, y=5, and so on.

But if you say:

  > sapply(3:5, f, x=5)

He's going to do f(3, x=5), f(4, x=5), f(5, x=5). In this case, R will
infer that you mean _y_ takes the value 3 in the first case! When you
say f(3, x=5), R understands that you are doing f(5,3) or
f(x=5,y=3). Through this behaviour, you can use sapply to apply list
elements to any parameter of a function, not just the 1st.
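
A quick sketch of both cases side by side, reusing myfunction from
above. The variable names a and b are only for illustration.

```r
myfunction <- function(x, y) x*x + y

# y is fixed by name, so each element lands in x
a <- sapply(3:5, myfunction, y = 5)   # myfunction(3, y=5), ...

# x is fixed by name, so each element lands in y
b <- sapply(3:5, myfunction, x = 5)   # myfunction(3, x=5), ...
```

One caveat: this only works if the name you fix does not collide with
sapply's own argument names (X, FUN, simplify, USE.NAMES).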

Hence, it's easy to use sapply to work over all elems of a list for
any arg of any function. Faced with a list and f(x,y,z), if you wanted
to iterate the values for z, you would say

  > sapply(list, f, x=value, y=value)

He would repeatedly do things like f(list[[i]], x=value, y=value), and
R would figure out that what you meant was for z to be list[[i]]. Cool!
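
As a sketch of the three-arg case: the function f and the list vals
below are invented for illustration. The second call shows an
alternative that spells out the mapping with a small wrapper function,
which some may find easier to read.

```r
# Hypothetical three-arg function; each arg lands in a different digit
f <- function(x, y, z) 100*x + 10*y + z

vals <- list(7, 8, 9)

# x and y are fixed by name, so each list element ends up as z
res <- sapply(vals, f, x = 1, y = 2)

# Same result, with the mapping written out explicitly
res2 <- sapply(vals, function(v) f(x = 1, y = 2, z = v))
```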

To all those who helped me: I'm not sure I was accurately articulating
my question, but you helped me understand what I needed to find
out. Thanks! Hope this posting helps someone else out there.

-- 
Ajay Shah                                                   Consultant
ajayshah at mayin.org                      Department of Economic Affairs
http://www.mayin.org/ajayshah           Ministry of Finance, New Delhi



