[R] How can I avoid nested 'for' loops or quicken the process?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Dec 26 09:44:19 CET 2008
On Thu, 25 Dec 2008, Oliver Bandel wrote:
> Bert Gunter <gunter.berton <at> gene.com> writes:
>
>>
>> FWIW:
>>
>> Good advice below! -- after all, the first rule of optimizing code is:
>> Don't!
>>
>> For the record (yet again), the apply() family of functions (and their
>> packaged derivatives, of course) are "merely" very carefully written for()
>> loops: their main advantage is in code readability, not in efficiency gains,
>> which may well be small or nonexistent. True efficiency gains require
>> "vectorization", which essentially moves the for() loops from interpreted
>> code to (underlying) C code (on the underlying data structures): e.g.
>> compare rowMeans() [vectorized] with ave() or apply(..,1,mean).
> [...]
>
> The apply-functions do bring speed-advantages.
>
> This is not only what I read about it,
> I have used the apply-functions and really got
> results faster.
>
> The reason is simple: an apply-function does
> in C what would otherwise be done at the level of R
> with for-loops.
Not true of apply(): true of lapply() and hence sapply(). I'll leave you
to check eapply, mapply, rapply, tapply.
So the issue is what is meant by 'the apply() family of functions': people
often mean *apply(), of which apply() is an unusual member, if one at all.
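One way to do that checking is to look at the R-level body of each function and see whether it contains a for() loop or hands off to .Internal(). The following is only a rough sketch: inspect_fun is a hypothetical helper, not part of R, and what it reports can differ between R versions.

```r
# Hypothetical helper (not part of R): report whether the R-level body of a
# function contains a for() loop and whether it calls .Internal() directly.
inspect_fun <- function(f) {
  calls <- all.names(body(f))
  c(r_for_loop = "for" %in% calls, internal_call = ".Internal" %in% calls)
}

# The members of the *apply() family named above
fams <- list(apply = apply, lapply = lapply, sapply = sapply,
             eapply = eapply, mapply = mapply, rapply = rapply,
             tapply = tapply)

# One row per function, one column per check
res <- t(sapply(fams, inspect_fun))
print(res)
```

This only inspects the top-level body. A function showing neither a for() loop nor an .Internal() call is delegating to something else (sapply() calls lapply(), for instance), so its body still needs to be read by printing the function at the prompt.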
[Historical note: a decade ago lapply was internally a for() loop. I
rewrote it in C in 2000: I also moved apply to C at the same time but it
proved too little an advantage and was reverted. The speed of lapply
comes mainly from reduced memory allocation: for() is also written in C.]
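To make the comparison in the quoted text concrete, here is a minimal sketch that computes row means three ways: an explicit for() loop with a preallocated result, apply(), and the vectorized rowMeans(). The matrix size, the seed and the repeat count are arbitrary assumptions, and no timings are claimed, since they depend on the machine and the R version.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility
m <- matrix(rnorm(2000 * 50), nrow = 2000)   # hypothetical test data

# 1. Explicit for() loop, with the result vector allocated once up front
loop_means <- numeric(nrow(m))
for (i in seq_len(nrow(m))) loop_means[i] <- mean(m[i, ])

# 2. apply(): loops at the R level behind a tidier interface
apply_means <- apply(m, 1, mean)

# 3. rowMeans(): the loop runs in compiled code
vec_means <- rowMeans(m)

# All three agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(loop_means, apply_means)),
          isTRUE(all.equal(apply_means, vec_means)))

# Rough timings; run them yourself rather than trusting anyone's numbers
print(system.time(for (k in 1:20) apply(m, 1, mean)))
print(system.time(for (k in 1:20) rowMeans(m)))
```

The explicit loop is written with numeric(nrow(m)) allocated in advance so that the comparison is not dominated by growing the result one element at a time.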
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595