[R-SIG-Win] Some time tests with new toolchain

Avraham Adler avraham.adler at gmail.com
Fri Sep 25 20:11:53 CEST 2015


As promised, below are some timing results using Jeroen's recent
4.9.3-based 64-bit toolchain. I hope to blog about this in more detail
eventually. There are two suites of tests: one that focuses on
BLAS-related functionality and one that focuses on non-BLAS-related math
functionality. I've hosted the reproducible test code and full results
(quartiles, mean, SD, and CV) at [1] in case anyone wants to do any
hypothesis testing. My immediate takeaway: there is no substitute for a
fast BLAS if you are doing any matrix operations. I'll probably post a
suggested patch to make it easier to build R on Windows with a
pre-compiled OpenBLAS. This also tempts me very much to tinker under the
hood and see what would be necessary to allow building R on Windows using
an optimized LAPACK as well (both ATLAS and OpenBLAS allow for building an
optimized LAPACK).
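
For anyone who wants a feel for how summary statistics like these can be
produced, here is a minimal base-R sketch. It is not the code hosted at
[1]; the helper name time_stats, the 200 x 200 matrix, and the use of
system.time are all assumptions made for illustration.

```r
# Hypothetical helper (NOT the test code hosted at [1]): run `f` a number
# of times and summarize the elapsed wall-clock times in milliseconds.
time_stats <- function(f, times = 25L) {
  ms <- vapply(seq_len(times), function(i) {
    system.time(f())[["elapsed"]] * 1000
  }, numeric(1))
  # Quartiles, mean, standard deviation, and coefficient of variation.
  c(quantile(ms, c(0.25, 0.5, 0.75)),
    mean = mean(ms), sd = sd(ms), cv = sd(ms) / mean(ms))
}

set.seed(1)
A <- matrix(rnorm(200 * 200), 200, 200)  # small on purpose so the sketch runs quickly
res <- time_stats(function() A %*% A, times = 25L)
print(round(res, 3))
```

system.time has coarse resolution, so very fast expressions may report
zeros (and an undefined CV); a dedicated benchmarking package is a better
fit for sub-millisecond operations.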

I found some weird results when testing link-time optimization (LTO). In
the non-BLAS section, the LTO build with mtune=native was the fastest and
the LTO build with march=native was the slowest, though there really isn't
much difference between mtune=native with and without LTO. For
BLAS-related calls, the LTO builds were slower than the non-LTO builds
(although a sample size of 25 may be too small). Compiling with LTO means
that some packages may misbehave when compiled from source and thus
require binary installs (stringi and dplyr are two that I have found). I'm
therefore unsure whether changes should be made to the makefiles to make
LTO a simple setting in Mkrules.local, or whether it is better to post
instructions online somewhere (or to this list) for the enterprising
adventurer to try on his or her own. Your collective thoughts?
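
On the sample-size caveat, one possible way to check whether an LTO
difference is more than noise is a rank-based test plus a rough power
calculation. The sketch below uses synthetic timings generated on the
spot, not measurements from these builds; the means, standard deviation,
and the 2 ms shift are assumptions chosen only to make the example run.

```r
set.seed(42)
# Synthetic timings in ms (NOT measured data): two builds, 25 runs each.
no_lto <- rnorm(25, mean = 40, sd = 2)
lto    <- rnorm(25, mean = 42, sd = 2)

# Wilcoxon rank-sum test: makes no normality assumption about the timings.
wt <- wilcox.test(lto, no_lto)

# Power of a two-sample t-test to detect a 2 ms shift with 25 runs per
# build and a standard deviation of 2 ms (both values are assumptions).
pw <- power.t.test(n = 25, delta = 2, sd = 2)

print(wt$p.value)
print(pw$power)
```

With the real per-run timings substituted for the synthetic vectors, the
same two calls would indicate whether 25 runs are enough to resolve the
differences seen here.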

There are seven builds tested against each other. The test platform was an
i7-3740QM @ 2.7GHz with 8GB RAM, running 64-bit Windows 7. The version of
R tested was R-devel_2015-09-10; all builds fully passed make check-devel
and make check-recommended. All times are in milliseconds. The descriptor
strings should be self-explanatory: Ref means reference BLAS, OPB is
OpenBLAS version 0.2.14, TG is mtune=generic, TN is mtune=native, and AN is
march=native. The results are in the following order. I apologize if the
formatting gets messed up.

* 463-SJLJ-Ref-TG
* 493-SEH-POSIX-Ref-TG
* 493-SEH-POSIX-OPB-TG
* 493-SEH-POSIX-OPB-TN
* 493-SEH-POSIX-OPB-AN
* 493-SEH-POSIX-OPB-TN-LTO
* 493-SEH-POSIX-OPB-AN-LTO
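
As a quick illustration of how the descriptor strings break down, the
following sketch splits one on "-" and maps the tokens to the meanings
given above. The function name decode_build is made up for this example,
and reading the leading number as the toolchain's GCC version (for
example 493 for 4.9.3) is an assumption.

```r
# Hypothetical decoder for the build descriptor strings listed above.
decode_build <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)[[1]]
  list(
    toolchain = parts[1],  # assumed to be the GCC version, e.g. "493" for 4.9.3
    blas      = if ("OPB" %in% parts) "OpenBLAS 0.2.14" else "reference BLAS",
    tuning    = if ("AN" %in% parts) {
      "march=native"
    } else if ("TN" %in% parts) {
      "mtune=native"
    } else {
      "mtune=generic"
    },
    lto       = "LTO" %in% parts
  )
}

str(decode_build("493-SEH-POSIX-OPB-TN-LTO"))
```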

[1] <http://www.avrahamadler.com/SpeedTests2015%20v4.txt>

Thank you,

Avi

BLAS-related:

Times in milliseconds; columns (1) to (7) follow the build order listed above.

Expression                                  (1)        (2)        (3)        (4)        (5)        (6)        (7)
sort(c(as.vector(A), as.vector(B)))     410.553    478.423    407.874    405.382    406.139    406.254    407.029
det(A)                                  222.413    230.527     27.056     28.022     26.704     28.900     28.956
A %*% B                                 680.316    661.466     40.803     42.848     38.918     40.602     37.693
t(A) %*% B                              692.379    668.152     52.604     52.422     53.991     52.072     49.743
crossprod(A, B)                       1,191.143  1,198.826     39.234     35.717     39.510     35.319     36.704
solve(A)                              1,080.882  1,139.137     82.553     83.621     81.876     89.614     89.788
solve(A, t(B))                        1,501.146  1,566.338     90.099     92.029     90.027    101.827    110.558
solve(B)                              1,074.366  1,091.797     99.424     98.131     99.472    106.990    108.083
chol(A)                                 203.694    277.394     15.455     16.834     16.306     19.880     18.918
chol(B, pivot = TRUE)                     4.436      9.121      4.856      5.015      4.887      5.990      5.915
qr(A, LAPACK = TRUE)                    694.455    697.853    132.854    133.123    131.401    133.232    133.786
svd(A)                                3,577.719  3,527.317    623.759    630.878    617.791    626.818    632.760
eigen(A, symmetric = TRUE)            1,557.346  1,593.283    290.351    296.038    290.124    293.611    293.482
eigen(A, symmetric = FALSE)           5,938.559  5,710.295  1,361.771  1,409.270  1,440.704  1,425.261  1,488.135
eigen(B, symmetric = FALSE)           6,703.844  6,460.657  4,820.368  4,726.327  4,799.318  4,686.093  4,924.717
lu(A)                                   240.613    293.879     45.978     47.799     45.017     47.169     47.200
fft(A)                                  161.925    167.645    161.838    162.145    157.566    161.197    158.356
Hilbert(3000)                           258.187    462.391    254.183    255.146    254.994    252.812    257.502
toeplitz(A[1:500, 1])                     6.423     12.544      6.595      6.569      6.701      6.618      7.179
princomp(A)                           2,961.546  2,977.561    471.074    479.347    469.696    467.749    474.495
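
To put the "no substitute for a fast BLAS" point in numbers, the sketch
below computes speedup ratios for a handful of rows, using the second and
third values in each row (493-SEH-POSIX-Ref-TG and 493-SEH-POSIX-OPB-TG),
copied from the results above. Which rows to include is an arbitrary
choice for illustration.

```r
# Times in ms copied from the BLAS-related results: reference BLAS vs
# OpenBLAS, both built with the 4.9.3 toolchain and mtune=generic.
ref <- c("A %*% B"         = 661.466,
         "crossprod(A, B)" = 1198.826,
         "solve(A)"        = 1139.137,
         "svd(A)"          = 3527.317,
         "fft(A)"          = 167.645)
opb <- c("A %*% B"         = 40.803,
         "crossprod(A, B)" = 39.234,
         "solve(A)"        = 82.553,
         "svd(A)"          = 623.759,
         "fft(A)"          = 161.838)

# Ratio > 1 means the OpenBLAS build was faster by that factor.
speedup <- ref / opb
print(round(speedup, 1))
```

Rows such as fft(A), which do not go through BLAS, stay close to a ratio
of 1, while the matrix products and solves improve by an order of
magnitude or more.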



Non-BLAS related:

Times in milliseconds; columns (1) to (7) follow the build order listed above.

Expression                      (1)      (2)      (3)      (4)      (5)      (6)      (7)
A + 2                         3.340    3.318    3.325    3.355    3.398    3.316    3.586
A - 2                         3.440    3.410    3.412    3.379    3.501    3.375    3.658
A * 2                         3.373    3.404    3.409    3.411    3.474    3.325    3.654
A/2                           5.232    3.763    3.735    3.747    3.893    3.796    4.033
A * 0.5                       3.403    3.371    3.391    3.394    3.476    3.360    3.632
A^2                           3.341    3.396    3.381    3.374    3.459    3.327    3.673
sqrt(A[1:10000])              0.192    0.189    0.188    0.178    0.890    0.174    0.884
sin(A[1:10000])               0.641    0.638    0.635    0.611    1.251    0.612    1.249
A + B                         1.807    1.834    1.839    1.843    1.833    1.807    1.843
A - B                         1.787    1.842    1.835    1.850    1.829    1.801    1.848
A * B                         1.797    1.824    1.854    1.858    1.828    1.828    1.842
A/B                           5.452    2.817    2.811    2.820    2.816    2.835    2.906
A[1:100000]%%B[1:100000]      3.999    4.439    4.446    4.445    4.452    4.517    4.445
A[1:100000]%/%B[1:100000]     3.660    4.056    4.079    4.056    4.075    4.056    4.056
