[R-sig-Debian] 50% performance of custom R build compared to PPA R for a command

Scott Kostyshak skostysh at princeton.edu
Fri Apr 25 18:30:37 CEST 2014


On Fri, Apr 25, 2014 at 11:59 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> On 25 April 2014 at 11:38, Scott Kostyshak wrote:
> | On Thu, Apr 24, 2014 at 4:32 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
> | >
> | > Scott,
> | >
> | > My first quick hunches are a) 50% is too much for compiler switches, b) your
> | > example shows R code, and c) are you sure you are using the same BLAS?
> |
> | Thanks for the quick reply Dirk and for the suggestions.
> |
> | As for BLAS, yes I believe I'm using the same BLAS. The output of the
> | following two commands is the same (except for the memory addresses of
> | course):
> | $ ldd /usr/local/lib/R-devel/lib/R/bin/exec/R
> | $ ldd /usr/lib/R/bin/exec/R
> |
> | And executing
> | $ lsof -p <PID> | grep 'blas\|lapack'
> | also returns the same output for both Rs:
> | R       13017 scott  mem    REG    8,1  9142768 2097161 /usr/lib/atlas-base/atlas/liblapack.so.3.0
> | R       13017 scott  mem    REG    8,1  3776592 2097162 /usr/lib/atlas-base/atlas/libblas.so.3.0
> |
> | I profiled and it seems that all of the R functions are slow (I can
> | post the output if anyone is interested). I rebuilt with -O3 in CFLAGS
> | and this improved things a lot. Time went down from 10 seconds to 5.7
>
> That is surprisingly large. In my mail yesterday I basically bet against it.
>
> | or so. I reprofiled and again the R functions of R-devel seem just a
> | tad slower across the board (I can send output if interested).
> |
> | Below are some timings comparing the optimized R-devel to R.
> |
> | $ time R-devel CMD BATCH mwe.R
> |
> | real 0m5.755s
> | user 0m5.678s
> | sys 0m0.079s
> |
> | $ time R CMD BATCH mwe.R
> |
> | real 0m5.453s
> | user 0m5.371s
> | sys 0m0.054s
> |
> | Rerunning the above commands multiple times gives about the same output.
> |
> | There's still a 0.3-second difference and I'm curious to know why. Any ideas?
>
> Different code base?
>
> If you want _identical_ outcomes you need identical _input_: code, compiler,
> settings, hardware, ...

Not looking for identical, just trying to learn something about other
possible sources of the difference, e.g. libraries that I'm not linking
against at compile time, or differences in byte compiling R. But it
doesn't seem like there are any obvious candidates, so I'll stop here
for now.
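
For anyone who wants to dig further, here is a minimal sketch of how the
build settings of the two installations could be compared. It relies on
`R CMD config`, which prints the value of a configure variable. The helper
name compare_r_config and the choice of variables are only illustrative
assumptions; this was not run as part of the thread.

```shell
#!/bin/sh
# Sketch: report configure variables that differ between two R installations.
# compare_r_config is a hypothetical helper name; the variable list is an
# assumption about which settings are most likely to matter for speed.
compare_r_config() {
    r_a=$1
    r_b=$2
    for var in CC CFLAGS CXXFLAGS BLAS_LIBS LAPACK_LIBS; do
        # "R CMD config VAR" prints the value recorded when R was built
        val_a=$("$r_a" CMD config "$var" 2>/dev/null)
        val_b=$("$r_b" CMD config "$var" 2>/dev/null)
        if [ "$val_a" != "$val_b" ]; then
            printf '%s differs:\n  %s: %s\n  %s: %s\n' \
                "$var" "$r_a" "$val_a" "$r_b" "$val_b"
        fi
    done
}

# Only run the comparison when both commands are actually on the PATH.
if command -v R >/dev/null 2>&1 && command -v R-devel >/dev/null 2>&1; then
    compare_r_config R R-devel
fi
```

If nothing is printed, the listed variables match, and the remaining
difference would have to come from somewhere else (e.g. the code base).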

Thanks for the help, Dirk.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


