[R] Improving data processing efficiency
Hadley Wickham
h.wickham at gmail.com
Sat Jun 7 00:55:04 CEST 2008
On Fri, Jun 6, 2008 at 5:10 PM, Daniel Folkinshteyn <dfolkins at gmail.com> wrote:
> Hmm... OK... so I ran the code twice - once with a preallocated result,
> assigning rows into it, and once with an nrow=0 result, rbinding rows onto
> it, for the first 20 quarters. There was no speedup. In fact, running with
> a preallocated result matrix was slower than rbinding to the matrix:
>
> for preallocated matrix:
> Time difference of 1.577779 mins
>
> for rbinding:
> Time difference of 1.498628 mins
>
> (the time difference only counts from the start of the loop until the end,
> so the time to allocate the empty matrix was /not/ included in the count).
>
> So, it appears that rbinding a matrix is not the bottleneck. (That it was
> actually faster than assigning rows could have been a random anomaly - e.g.
> some other process eating a bit of CPU during the run - or not; at any
> rate, it doesn't make an /appreciable/ difference.)
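One quick sanity check first: the preallocate-vs-rbind difference is
easiest to see in isolation. A minimal sketch, with made-up sizes and a
stand-in make_row() since I don't have your actual per-row code:

n <- 5000
make_row <- function(i) c(i, rnorm(2))   # stand-in for the real work

# preallocate, then assign into rows
system.time({
  res1 <- matrix(NA_real_, nrow = n, ncol = 3)
  for (i in 1:n) res1[i, ] <- make_row(i)
})

# grow the result with rbind() on every iteration
system.time({
  res2 <- matrix(nrow = 0, ncol = 3)
  for (i in 1:n) res2 <- rbind(res2, make_row(i))
})

The rbind() version copies the whole matrix on each iteration, so its
cost grows roughly quadratically with n; if the two versions time the
same on your data, the dominant cost is almost certainly elsewhere in
the function.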
Why not try profiling? The profr package provides an alternative
display that I find more helpful than the default tools:
install.packages("profr")
library(profr)
p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))  # profile one call
plot(p)  # time on the x-axis, call-stack depth on the y-axis
That should at least help you see where the slow bits are.
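If you'd rather not install anything, the standard tools can produce the
same information; a minimal sketch using Rprof() and summaryRprof() from
the utils package, reusing the placeholder call from above:

Rprof("prof.out")
fcn_create_nonissuing_match_by_quarterssinceissue(...)  # your actual call
Rprof(NULL)                              # stop profiling
head(summaryRprof("prof.out")$by.self)   # functions ranked by self time

Either way, start with the calls that have the largest self time.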
Hadley
--
http://had.co.nz/