[R-SIG-Win] Some time tests with new toolchain
Avraham Adler
avraham.adler at gmail.com
Fri Sep 25 20:11:53 CEST 2015
As promised, below are some timing results using Jeroen's recent
4.9.3-based 64-bit toolchain. I hope to blog about this in more detail
eventually. There are two suites of tests: one that focuses on
BLAS-related functionality and one that focuses on non-BLAS-related math
functionality. I've hosted the reproducible test code and full results
(quartiles, mean, SD, and CV) at [1] in case anyone wants to do any
hypothesis testing. My immediate takeaway: there is no substitute for a
fast BLAS if you are doing any matrix operations. I'll probably post a
suggested patch to make it easier to build R on Windows with a
pre-compiled OpenBLAS. This also tempts me very much to tinker under the
hood and see what would be necessary to build R on Windows with an
optimized LAPACK as well (both ATLAS and OpenBLAS allow for building an
optimized LAPACK).
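For anyone who wants a quick feel for this kind of measurement without
downloading the full script, here is a minimal base-R sketch. It is not the
test code hosted at [1]; the helper names (time_ms, summarise_ms), the matrix
size, and the choice of three expressions are assumptions made for
illustration. It times each expression 25 times and reports the same
summary statistics mentioned above.

```r
# Hypothetical helper: run f() `reps` times, return elapsed times in ms.
time_ms <- function(f, reps = 25L) {
  vapply(seq_len(reps), function(i) {
    system.time(f())[["elapsed"]] * 1000
  }, numeric(1))
}

# Hypothetical helper: quartiles, mean, SD and coefficient of variation.
summarise_ms <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c(Q1 = q[1], median = q[2], Q3 = q[3],
    mean = mean(x), SD = sd(x), CV = sd(x) / mean(x))
}

set.seed(1)
n <- 300L  # assumed size, kept small so the sketch finishes quickly
A <- matrix(rnorm(n * n), n, n)
B <- matrix(rnorm(n * n), n, n)

results <- rbind(
  "A %*% B"         = summarise_ms(time_ms(function() A %*% B)),
  "crossprod(A, B)" = summarise_ms(time_ms(function() crossprod(A, B))),
  "solve(A)"        = summarise_ms(time_ms(function() solve(A)))
)
print(round(results, 3))
```

With matrices this small the timer resolution matters, so the CV can be
noisy (or undefined if every run rounds to zero); larger matrices give more
stable numbers.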
I found some odd results when testing link-time optimization (LTO) in the
non-BLAS section: with mtune=native, the LTO build was the fastest; with
march=native, it was the slowest, though there really isn't much
difference between mtune=native with and without LTO. For BLAS-related
calls, the LTO builds were slower than the non-LTO builds (although a
sample size of 25 may be too small). Compiling with LTO means that some
packages may misbehave when compiled from source and thus require binary
installs (stringi and dplyr are two that I have found), so I'm unsure
whether the makefiles should be changed to make LTO a simple setting in
Mkrules.local, or whether it is better to post instructions online
somewhere (or to this list) for the enterprising adventurer to try on his
or her own. Your collective thoughts?
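To put a number on the LTO effect for BLAS-related calls, the sketch below
computes the percentage change from the mtune=native build to the
mtune=native LTO build. The timings are copied from the BLAS-related
results at the end of this message; the choice of these four rows and the
variable names are mine, purely for illustration.

```r
# Times in milliseconds for 493-SEH-POSIX-OPB-TN and 493-SEH-POSIX-OPB-TN-LTO,
# copied from the BLAS-related results in this message.
tn     <- c("A %*% B" = 42.848, "solve(A)" = 83.621,
            "solve(A, t(B))" = 92.029, "chol(A)" = 16.834)
tn_lto <- c("A %*% B" = 40.602, "solve(A)" = 89.614,
            "solve(A, t(B))" = 101.827, "chol(A)" = 19.880)

# Positive values mean the LTO build was slower.
pct_change <- 100 * (tn_lto - tn) / tn
print(round(pct_change, 1))
```

Positive entries are rows where LTO was slower. Given the sample size of
25, small differences in either direction should be read with caution.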
There are seven builds tested against each other. The test platform was an
i7-3740QM @ 2.7 GHz with 8 GB RAM, running Win7 64. The version of R
tested was R-devel_2015-09-10 (all builds fully passed make check-devel
and make check-recommended), and all times are in milliseconds. The
descriptor strings should be self-explanatory: Ref means reference BLAS,
OPB is OpenBLAS version 0.2.14, TG is mtune=generic, TN is mtune=native,
and AN is march=native. The results are listed in the following order. I
apologize if the formatting gets messed up.
* 463-SJLJ-Ref-TG
* 493-SEH-POSIX-Ref-TG
* 493-SEH-POSIX-OPB-TG
* 493-SEH-POSIX-OPB-TN
* 493-SEH-POSIX-OPB-AN
* 493-SEH-POSIX-OPB-TN-LTO
* 493-SEH-POSIX-OPB-AN-LTO
[1] <http://www.avrahamadler.com/SpeedTests2015%20v4.txt>
Thank you,
Avi
BLAS-related (times in milliseconds):
Expression                           463-Ref-TG 493-Ref-TG     OPB-TG     OPB-TN     OPB-AN     TN-LTO     AN-LTO
sort(c(as.vector(A), as.vector(B)))     410.553    478.423    407.874    405.382    406.139    406.254    407.029
det(A)                                  222.413    230.527     27.056     28.022     26.704     28.900     28.956
A %*% B                                 680.316    661.466     40.803     42.848     38.918     40.602     37.693
t(A) %*% B                              692.379    668.152     52.604     52.422     53.991     52.072     49.743
crossprod(A, B)                       1,191.143  1,198.826     39.234     35.717     39.510     35.319     36.704
solve(A)                              1,080.882  1,139.137     82.553     83.621     81.876     89.614     89.788
solve(A, t(B))                        1,501.146  1,566.338     90.099     92.029     90.027    101.827    110.558
solve(B)                              1,074.366  1,091.797     99.424     98.131     99.472    106.990    108.083
chol(A)                                 203.694    277.394     15.455     16.834     16.306     19.880     18.918
chol(B, pivot = TRUE)                     4.436      9.121      4.856      5.015      4.887      5.990      5.915
qr(A, LAPACK = TRUE)                    694.455    697.853    132.854    133.123    131.401    133.232    133.786
svd(A)                                3,577.719  3,527.317    623.759    630.878    617.791    626.818    632.760
eigen(A, symmetric = TRUE)            1,557.346  1,593.283    290.351    296.038    290.124    293.611    293.482
eigen(A, symmetric = FALSE)           5,938.559  5,710.295  1,361.771  1,409.270  1,440.704  1,425.261  1,488.135
eigen(B, symmetric = FALSE)           6,703.844  6,460.657  4,820.368  4,726.327  4,799.318  4,686.093  4,924.717
lu(A)                                   240.613    293.879     45.978     47.799     45.017     47.169     47.200
fft(A)                                  161.925    167.645    161.838    162.145    157.566    161.197    158.356
Hilbert(3000)                           258.187    462.391    254.183    255.146    254.994    252.812    257.502
toeplitz(A[1:500, 1])                     6.423     12.544      6.595      6.569      6.701      6.618      7.179
princomp(A)                           2,961.546  2,977.561    471.074    479.347    469.696    467.749    474.495
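
The gain from swapping in OpenBLAS is easiest to see as a ratio. The sketch
below divides the reference-BLAS timing by the OpenBLAS timing (both
mtune=generic, both built with the 4.9.3 toolchain) for a handful of rows
copied from the BLAS-related results above. The selection of rows and the
variable names are assumptions for illustration only.

```r
# Times in milliseconds, copied from the BLAS-related results in this message:
# 493-SEH-POSIX-Ref-TG (reference BLAS) and 493-SEH-POSIX-OPB-TG (OpenBLAS).
ref_tg <- c("det(A)" = 230.527, "A %*% B" = 661.466,
            "crossprod(A, B)" = 1198.826, "svd(A)" = 3527.317,
            "fft(A)" = 167.645)
opb_tg <- c("det(A)" = 27.056, "A %*% B" = 40.803,
            "crossprod(A, B)" = 39.234, "svd(A)" = 623.759,
            "fft(A)" = 161.838)

# Ratio > 1 means the OpenBLAS build was faster by that factor.
speedup <- ref_tg / opb_tg
print(round(speedup, 1))
```

The matrix operations come out many times faster, while fft(A) stays close
to 1, as one would expect for a call that does not go through BLAS.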
Non-BLAS related (times in milliseconds):
Expression                   463-Ref-TG 493-Ref-TG     OPB-TG     OPB-TN     OPB-AN     TN-LTO     AN-LTO
A + 2                             3.340      3.318      3.325      3.355      3.398      3.316      3.586
A - 2                             3.440      3.410      3.412      3.379      3.501      3.375      3.658
A * 2                             3.373      3.404      3.409      3.411      3.474      3.325      3.654
A/2                               5.232      3.763      3.735      3.747      3.893      3.796      4.033
A * 0.5                           3.403      3.371      3.391      3.394      3.476      3.360      3.632
A^2                               3.341      3.396      3.381      3.374      3.459      3.327      3.673
sqrt(A[1:10000])                  0.192      0.189      0.188      0.178      0.890      0.174      0.884
sin(A[1:10000])                   0.641      0.638      0.635      0.611      1.251      0.612      1.249
A + B                             1.807      1.834      1.839      1.843      1.833      1.807      1.843
A - B                             1.787      1.842      1.835      1.850      1.829      1.801      1.848
A * B                             1.797      1.824      1.854      1.858      1.828      1.828      1.842
A/B                               5.452      2.817      2.811      2.820      2.816      2.835      2.906
A[1:100000]%%B[1:100000]          3.999      4.439      4.446      4.445      4.452      4.517      4.445
A[1:100000]%/%B[1:100000]         3.660      4.056      4.079      4.056      4.075      4.056      4.056