[Rd] RE: [R] when can we expect Prof Tierney's compiled R?

Vadim Ogranovich vograno at evafunds.com
Fri Apr 22 21:02:06 CEST 2005


If we are on the subject of byte compilation, let me bring a couple of
examples which have been puzzling me for some time. I'd like to know a)
if the compilation will likely to improve the performance for this type
of computations, and b) at least roughly understand the reasons for the
observed numbers, specifically why x[i]<- assignment is so much slower
than x[i] extraction.

The loops below are typical in any recursive calculation where the new
value of a vector is based on its immediate neighbor say to the left.
Specifically we assign the previous value to the current element.

# this shows that the assignment x[i]<- is the bottleneck in the loop
> n = 1e6; iA = seq(2,n); x = double(n); system.time(for (i in iA) x[i]
= x[i-1])
[1] 4.29 0.00 4.30 0.00 0.00
> n = 1e6; iA = seq(2,n); x = double(n); system.time(for (i in iA)
x[i-1])
[1] 1.46 0.01 1.46 0.00 0.00

# the overhead of the loop itself is reasonably low, just 0.17 sec
> n = 1e6; iA = seq(2,n); x = double(n); system.time(for (i in iA) 1)
[1] 0.17 0.01 0.17 0.00 0.00

# pure assignment (w/o the extraction x[i]) takes 3.09 sec. Thus x[i] as
extraction is (3.09 - 0.17)/(0.79 - 0.17) = 4.7 times faster than x[i]<-
as assignment. This looks a bit odd.
> n = 1e6; iA = seq(2,n); x = double(n); system.time(for (i in iA) x[i]
= 1.0)
[1] 3.08 0.00 3.09 0.00 0.00


# this shows that just evaluation of (i-1) takes about (0.79 - 0.24) =
0.55 sec on my machine (AMD 64 bit). Looks too slow.
> n = 1e6; iA = seq(2,n); x = double(n); system.time(for (i in iA) i-1)
[1] 0.79 0.00 0.79 0.00 0.00
> n = 1e6; iA = seq(2,n); x = double(n); system.time(for (i in iA) i)
[1] 0.24 0.01 0.24 0.00 0.00

Thanks,
Vadim

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Luke Tierney
> Sent: Friday, April 22, 2005 7:33 AM
> To: Peter Dalgaard
> Cc: Jason Liao; r-help at stat.math.ethz.ch
> Subject: Re: [R] when can we expect Prof Tierney's compiled R?
> 
> On Wed, 20 Apr 2005, Peter Dalgaard wrote:
> 
> > Luke Tierney <luke at stat.uiowa.edu> writes:
> >
> >> Vectorized operations in R are also as fast as compiled C (because 
> >> that is what they are :-)).  A compiler such as the one 
> I'm working 
> >> on will be able to make most difference for 
> non-vectorizable or not 
> >> very vectorizable code.  It may also be able to reduce the 
> need for 
> >> intermediate allocations in vectorizable code, which may 
> have other 
> >> benefits beyond just speed improvements.
> >
> > Actually, it has struck me a couple of times that these 
> operations are 
> > not as fast as they could be, since they are outside the 
> scope of fast 
> > BLAS routines, but "embarrassingly parallel" code could easily be 
> > written for the relevant hardware. Even on uniprocessor 
> systems there 
> > might be speedups that the C compiler cannot find (e.g. because it 
> > cannot assume that source and destination of the operation are 
> > distinct).
> 
> My guess is that for anything beyond basic operations we are 
> doing OK on uniprocessors. but it would be useful to do some 
> testing to be sure.  For the basic operations I suspect we 
> are paying a heavy price for the way we handle recycling, 
> both in terms of overhead as such and in terms of inhibiting 
> compiler optimizations. For performance it would probably be 
> better to code the scalar-vector, equal-length-vector, and 
> general cases separately, though keeping the code 
> maintainable may be a bit of a challenge.  Again testing on a 
> range of platforms and compilers would be useful.
> 
> With multiprocessors likely to become more widely available 
> it would be good to look into ways of factoring the 
> vectorized math code so we can slide in one that uses threads 
> when approprate.  This should dovetail nicely with 
> compilation to identify larger vectorized expressions that 
> can be parallelized as a unit; I hope to look into this a bit 
> this summer.
> 
> luke
> 
> 
> 
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>     Actuarial Science
> 241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>



More information about the R-devel mailing list