[R] Resources for optimizing code

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Nov 5 19:17:07 CET 2004


On Fri, 5 Nov 2004, Roger Bivand wrote:

> On Fri, 5 Nov 2004, Janet Elise Rosenbaum wrote:
> 
> > 
> > I want to eliminate certain observations in a large dataframe (21000x100).
> > I have written code which does this using a binary vector (0=delete obs,
> > 1=keep), but it uses for loops, and so it's slow and in the extreme it 
> > causes R to hang for indefinite time periods.
> > 
> > I'm looking for one of two things:
> > 1.  A document which discusses how to avoid for loops and situations in
> > which it's impossible to avoid for loops.
> > 
> > or
> > 
> > 2.  A function which can do the above better than mine.  
> 
> ?subset
> newdata <- subset(DATAFRAME, asst==1)
> 
> which will work whether DATAFRAME is a matrix or data.frame (two different 
> classes).

Sorry, not for matrices:

> A <- matrix(1:20, 5)
> asst <- c(1,0,0,1,0)
> subset(A, asst)
[1]  1  4  6  9 11 14 16 19
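
The eight values above are what you would get if the 0/1 selector were applied as a plain vector index and recycled over all 20 elements of A, instead of being applied to rows. The sketch below only reproduces that pattern with ordinary indexing. It is not a statement of how subset() is implemented, and other R versions may treat a matrix argument differently. The names as_vector_index and as_row_index are made up for illustration.

```r
A <- matrix(1:20, 5)
asst <- c(1, 0, 0, 1, 0)

# Vector-style indexing (no comma): the logical selector is recycled
# over all 20 elements of A, taken in column-major order.
as_vector_index <- A[as.logical(asst)]

# Row indexing (with a comma): the selector picks whole rows instead.
as_row_index <- A[as.logical(asst), ]
```

Here as_vector_index holds the elements at positions 1, 4, 6, 9, ... of A, while as_row_index is a 2 x 4 matrix holding rows 1 and 4.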

Maybe it should, but in biggish problems like this it is almost certainly 
a bit more efficient to use the bare tools, that is indexing.
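
A minimal sketch of the indexing approach, assuming the 0/1 vector is called asst as in the earlier reply. DF, keep, A_kept and DF_kept are hypothetical names, and the data are a toy stand-in for the 21000x100 data frame.

```r
# Toy data: a small matrix and a data frame built from it
A <- matrix(1:20, 5)
DF <- as.data.frame(A)
asst <- c(1, 0, 0, 1, 0)   # 1 = keep the observation, 0 = delete it

# Turn the 0/1 codes into a logical row selector, with no loop
keep <- asst == 1

# The same row indexing works for both classes;
# drop = FALSE keeps the result two-dimensional even if one row survives
A_kept  <- A[keep, , drop = FALSE]
DF_kept <- DF[keep, , drop = FALSE]
```

If asst can contain NA, keep will contain NA as well and indexing with it yields rows of NA; using which(keep) as the index drops those cases.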

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



