[R] How can I avoid nested 'for' loops or quicken the process?

Fri Dec 26 09:44:19 CET 2008

On Thu, 25 Dec 2008, Oliver Bandel wrote:

> Bert Gunter <gunter.berton <at> gene.com> writes:
>
>>
>> FWIW:
>>
>> Good advice below! -- after all, the first rule of optimizing code is:
>> Don't!
>>
>> For the record (yet again), the apply() family of functions (and their
>> packaged derivatives, of course) are "merely" very carefully written for()
>> loops: their main advantage is in code readability, not in efficiency gains,
>> which may well be small or nonexistent. True efficiency gains require
>> "vectorization", which essentially moves the for() loops from interpreted
>> code to (underlying) C code (on the underlying data structures): e.g.
>> compare rowMeans() [vectorized] with ave() or apply(..,1,mean).
> [...]
>
> The apply-functions do bring speed-advantages.
>
> This is not only what I read about it,
> I have used the apply-functions and really got
> results faster.
>
> The reason is simple: an apply-function does
> in C what would otherwise be done at the R level
> with for-loops.

Not true of apply(): true of lapply() and hence sapply().  I'll leave you 
to check eapply, mapply, rapply, tapply.

So the issue is what is meant by 'the apply() family of functions': people 
often mean *apply(), of which apply() is an unusual member, if one at all.
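
A minimal sketch of the distinction discussed above, comparing the vectorized
rowMeans() with apply(.., 1, mean) and an explicit for() loop. The matrix size
and the names m, r1, r2 and r3 are assumptions chosen for illustration; they
do not come from the thread. Timings depend on the machine and the R version,
so none are quoted.

```r
# Illustrative only: matrix dimensions and object names are assumptions.
set.seed(1)
m <- matrix(rnorm(200 * 50), nrow = 200)

# Vectorized: the loop over rows runs in compiled code.
r1 <- rowMeans(m)

# apply(): loops over the rows at the R level, calling mean() once per row.
r2 <- apply(m, 1, mean)

# Explicit for() loop with a preallocated result vector.
r3 <- numeric(nrow(m))
for (i in seq_len(nrow(m))) r3[i] <- mean(m[i, ])

# All three agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(r1, r2)), isTRUE(all.equal(r1, r3)))

# Rough timing comparison; repeat the calls so the difference is measurable.
system.time(for (k in 1:200) rowMeans(m))
system.time(for (k in 1:200) apply(m, 1, mean))
```

On most setups one would expect rowMeans() to be clearly faster than either of
the other two, and apply() and the for() loop to be of the same order, but
this is worth measuring rather than assuming.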

[Historical note: a decade ago lapply was internally a for() loop.  I 
rewrote it in C in 2000: I also moved apply to C at the same time but it 
proved too little an advantage and was reverted.  The speed of lapply 
comes mainly from reduced memory allocation: for() is also written in C.]
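
As a rough illustration of the point about allocation, the sketch below builds
the same list with lapply(), with a for() loop that preallocates its result,
and with a for() loop that grows its result one element at a time. The
function f, the length n and the result names are hypothetical. The sketch
does not reproduce the internal difference between lapply and for(); it only
shows that the results are identical and gives a way to time the variants.

```r
# Illustrative only: f, n and the result names are assumptions.
f <- function(i) i^2
n <- 10000

# lapply(): the result list is allocated once, at full length.
out_lapply <- lapply(seq_len(n), f)

# for() with the result list preallocated up front.
out_prealloc <- vector("list", n)
for (i in seq_len(n)) out_prealloc[[i]] <- f(i)

# for() that grows the list element by element (may trigger reallocation).
out_grow <- list()
for (i in seq_len(n)) out_grow[[i]] <- f(i)

# The three approaches give the same result.
stopifnot(identical(out_lapply, out_prealloc),
          identical(out_lapply, out_grow))

# Timings are machine- and version-dependent, so none are quoted.
system.time(lapply(seq_len(n), f))
system.time({x <- vector("list", n); for (i in seq_len(n)) x[[i]] <- f(i)})
```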

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595