[R] lapply vs. for (was: Incrementing a counter in lapply)
Patrick Burns
pburns at pburns.seanet.com
Wed Mar 15 23:04:02 CET 2006
In my opinion the main issue between using 'for' and
an apply function is the simplicity of the code. If it is
simpler and more understandable to use 'lapply' than
a 'for' loop in a situation, then use 'lapply'. If in a
different situation it is the 'for' loop that is simpler, then
use the 'for' loop.
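As a rough illustration of that choice (the data and object
names below are invented for this sketch, not taken from the
thread): 'lapply' tends to be shorter when each result depends
only on its own element, while a 'for' loop tends to read more
plainly when each step needs the previous one. If the position
of an element matters, it can be passed in explicitly, for
example with 'mapply', rather than kept in a counter.

```r
# Hypothetical data for illustration only.
x <- list(a = 1:3, b = 4:6, c = 7:9)

# Independent per-element work: lapply is compact and keeps the names.
sums_apply <- lapply(x, sum)

# The same result with a for loop needs a preallocated, named list.
sums_loop <- vector("list", length(x))
names(sums_loop) <- names(x)
for (i in seq_along(x)) {
  sums_loop[[i]] <- sum(x[[i]])
}

# Each step depends on the previous one: the for loop is the plainer form.
running <- numeric(length(x))
total <- 0
for (i in seq_along(x)) {
  total <- total + sum(x[[i]])
  running[i] <- total
}

# If the position is needed inside the function, pass it explicitly
# instead of relying on the order of evaluation.
labelled <- mapply(function(el, i) paste(i, sum(el)), x, seq_along(x))
```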
In modern day R whatever timing differences there may
be are likely to be slight, and virtually certain not to be
critical.
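Anyone who wants to check this on their own machine could time
both forms with 'system.time'. The sketch below uses a made-up
workload ('n' and 'f' are assumptions for illustration); the
numbers will vary with the machine and the R version, so no
particular result is claimed here.

```r
# Hypothetical workload; timings depend on machine and R version.
n <- 1e5
f <- function(i) i^2

# Time the lapply version.
t_apply <- system.time(res_apply <- unlist(lapply(seq_len(n), f)))

# Time the equivalent for loop, with the result preallocated.
t_loop <- system.time({
  res_loop <- numeric(n)
  for (i in seq_len(n)) res_loop[i] <- f(i)
})

# Both approaches should give the same values.
stopifnot(isTRUE(all.equal(res_apply, res_loop)))

# Elapsed seconds for each approach.
print(c(lapply = t_apply[["elapsed"]], loop = t_loop[["elapsed"]]))
```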
The confusion arises because in the olden days of S-PLUS
the timing differences could be quite substantial in some
cases. The hangover from that is that apply functions are
too often recommended in R.
Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
Gregor Gorjanc wrote:
>>From: Thomas Lumley
>>
>>
>>>On Tue, 14 Mar 2006, John McHenry wrote:
>>>
>>>>Thanks, Gabor & Thomas.
>>>>
>>>>Apologies, but I used an example that obfuscated the question that I
>>>>wanted to ask.
>>>>
>>>>I really wanted to know how to have extra arguments in functions that
>>>>would allow, per the example code, for something like a counter to be
>>>>incremented. Thomas's suggestion of using mapply (reproduced below with
>>>>corrections) is probably closest.
>>>
>>>It is probably worth pointing out here that the R documentation does not
>>>specify the order in which lapply() does the computation.
>>>
>>>If you could work out how to increment a counter (and you could, with
>>>sufficient effort), it would not necessarily work, because the 'i'th
>>>evaluation would not necessarily be of the 'i'th element.
>>>
>>>[lapply() does in fact start at the beginning, go on until it gets to the
>>>end, and then stop, but this isn't documented. Suppose R became
>>>multithreaded, for example....]
>>>
>>>
>>The corollary, it seems to me, is that sometimes it's better to leave the
>>good old for loop alone. It's not always profitable to turn for loops into
>>some *apply construct. The trick is learning to know when to do it and when
>>not to.
>>
>>
>
>Can someone share some of these tricks with me? Up to now I have always
>done things with for loops. Just recently I started to pay attention to
>*apply* constructs and I already wanted to start implementing them
>instead of the good old for, but then a stroke of lightning came from this
>thread. Based on the words from Thomas, lapply should not be used for tasks
>where order is critical. Did I understand this correctly? Additionally, I
>have read notes (I lost the link, but it was posted on R-help, I think) from
>Thomas on R in which he mentioned that it is commonly assumed that *apply* (I
>do not remember which one of *apply*) is faster than a loop, but that this
>is not true. Any additional pointers to literature?
>
>
>