[R] How can I avoid nested 'for' loops or quicken the process?

Fri Dec 26 12:07:02 CET 2008

Prof Brian Ripley wrote:
> On Thu, 25 Dec 2008, Oliver Bandel wrote:
....
>>
>> The apply-functions do bring speed-advantages.
>>
>> This is not only what I read about it,
>> I have used the apply-functions and really got
>> results faster.
>>
>> The reason is simple: an apply-function does
>> in C what would otherwise be done at the level of R
>> with for-loops.
> 
> Not true of apply(): true of lapply() and hence sapply().  I'll leave 
> you to check eapply, mapply, rapply, tapply.
> 
> So the issue is what is meant by 'the apply() family of functions': 
> people often mean *apply(), of which apply() is an unusual member, if 
> one at all.

Conceptually, I think it belongs there. apply(M,1,max) is similar to 
tapply(M,row(M),max), etc. The "apply-functions" share a general 
split-operate-reassemble set of semantics, and apply _could_ be 
implemented as splitting by indices in MARGIN, followed by lapply, 
followed by reassembly into a matrix, as in tapply().
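
To make the equivalence concrete, here is a small sketch. The matrix M and the names by_apply, by_tapply, pieces and by_split are made up for illustration; only apply(), tapply(), row(), split() and sapply() are base R.

```r
# Illustrative 3 x 4 matrix, filled column by column
M <- matrix(c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8), nrow = 3)

# Row maxima the usual way
by_apply <- apply(M, 1, max)

# Same grouping via tapply(): row(M) labels every cell with its row index
by_tapply <- tapply(M, row(M), max)

# Split / operate / reassemble spelled out by hand
pieces   <- split(M, row(M))    # list with one numeric vector per row
by_split <- sapply(pieces, max) # operate on each piece, collect the results

# All three agree once the names added by tapply()/sapply() are dropped
stopifnot(all.equal(by_apply, as.vector(by_tapply)),
          all.equal(by_apply, unname(by_split)))
```

The only visible difference is in the attributes: tapply() and sapply() label the result with the group names "1", "2", "3", while apply() on a matrix without dimnames returns a plain vector.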

In reality, apply() is implemented differently, using aperm() and direct 
indexing. This is more efficient, but it shouldn't necessarily change 
the way in which we think about it. It is a bit unfortunate that the 
most complex member of the family has gotten the most basic name, though.
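
A rough sketch of the aperm()-plus-indexing idea, for what it is worth. apply_sketch is a hypothetical, heavily simplified stand-in and not the actual source of apply(): it assumes a single MARGIN and a FUN that returns one value per slice, and it ignores dimnames entirely.

```r
# Hypothetical simplification of the aperm() + indexing approach;
# assumes length(MARGIN) == 1 and that FUN returns a scalar.
apply_sketch <- function(X, MARGIN, FUN, ...) {
  d <- dim(X)
  # Permute so the MARGIN dimension comes last ...
  perm <- c(seq_along(d)[-MARGIN], MARGIN)
  Y <- aperm(X, perm)
  # ... then flatten everything else, so each slice is one column of Y
  dim(Y) <- c(prod(d[-MARGIN]), d[MARGIN])
  out <- vector("list", d[MARGIN])
  for (i in seq_len(d[MARGIN])) {
    out[[i]] <- FUN(Y[, i], ...)  # direct indexing, no split() involved
  }
  unlist(out)
}

# Illustrative data
M <- matrix(c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8), nrow = 3)
stopifnot(all.equal(apply_sketch(M, 1, max), apply(M, 1, max)),
          all.equal(apply_sketch(M, 2, sum), apply(M, 2, sum)))
```

Note that the loop over slices is an ordinary for() loop at the R level, which fits the remark quoted above that apply() does not get its work done in C the way lapply() does.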

> [Historical note: a decade ago lapply was internally a for() loop.  I 
> rewrote it in C in 2000: I also moved apply to C at the same time but it 
> proved too little an advantage and was reverted.  The speed of lapply 
> comes mainly from reduced memory allocation: for() is also written in C.]
> 

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907