[R-sig-hpc] Intel Phi Coprocessor?

Mon Jun 10 16:00:03 CEST 2013

thanks, simon.  for me, it is all about running the same code faster,
i.e., without much optimization.  so no phi for me.  an Intel i7 costs
about $100 more (of a b-o-m cost of $1,000) than an i5.  I presume this
is still worth it, because library(parallel) can make use of the i7's
extra hardware threads.  correct?
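
a minimal sketch of how one might check this on a given machine, assuming a stock R install with the bundled parallel package.  simon's point below is that parallel scales by launching separate worker processes rather than threads, and comparing process ids makes that visible.  the cluster size of 2 is an arbitrary choice for illustration, and detectCores() is only a hint: it can return NA, and the logical/physical distinction is not honoured on every platform.

```r
library(parallel)

# logical CPUs include hyper-threads; physical cores may be NA on some platforms
n_logical  <- detectCores(logical = TRUE)
n_physical <- detectCores(logical = FALSE)
print(c(logical = n_logical, physical = n_physical))

# start two workers (arbitrary size) and ask each one for its process id
cl <- makeCluster(2)
worker_pids <- unlist(clusterCall(cl, Sys.getpid))
stopCluster(cl)

# the workers are separate R processes, so their ids differ from ours
main_pid <- Sys.getpid()
print(main_pid)
print(worker_pids)
```

if the worker ids differ from the main id, the extra hardware threads are being filled by whole R processes, each with its own memory, not by threads inside one R session.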

[this is asking too much, but here is a related quick question.  does
stock R take advantage of SSE?  SSEx?  AVX?  AVX2?  are these all
single-float based which stock R does not support anyway?]
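
a small sketch for the precision half of that question, assuming an IEEE 754 platform.  it only confirms that ordinary R numerics are double precision; it says nothing about which vector instructions a particular R or BLAS build was compiled to use.  (SSE2 and AVX do include double-precision instructions, so whether they are used depends on the compiler flags and the BLAS that R is linked against.)

```r
# ordinary numeric vectors in stock R are stored as double precision
x <- c(1.5, 2.5)
num_type <- typeof(x)

# number of significand bits in R's double type (53 for IEEE 754 doubles)
mantissa_bits <- .Machine$double.digits

print(num_type)
print(mantissa_bits)
```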

/iaw

----
Ivo Welch (ivo.welch at gmail.com)
http://www.ivo-welch.info/
J. Fred Weston Professor of Finance
Anderson School at UCLA, C519
Director, UCLA Anderson Fink Center for Finance and Investments
Free Finance Textbook, http://book.ivo-welch.info/
Editor, Critical Finance Review, http://www.critical-finance-review.org/

On Mon, Jun 10, 2013 at 5:15 AM, Simon Urbanek
<simon.urbanek at r-project.org> wrote:
> On Jun 10, 2013, at 1:44 AM, ivo welch wrote:
>
>> does R run on the intel phi coprocessor?  the intel literature makes it seem as if it can be treated just like a 50-core 200-thread just-like-i686 processor running linux, albeit with only 8GB of very fast shared RAM.  some posts have suggested it can be 2-3 times as fast as two high-end Intel Xeon 8-core machines.  how do simple library(parallel) R tasks scale on it?
>>
>
> Given that R is not thread-safe and almost everything (apart from parallel BLAS) is single-threaded, it's exactly the opposite of what you need for R. Explicit parallelization in R has overhead and cannot use threads, so you're better off with higher clock speed than a large number of cores (unless you use those explicitly for particular tasks by writing your own low-level code). I was not able to test phi, but generally, in our experience scaling to many cores does not work very well, in particular when you have so little RAM (the only way parallel can scale is by running multiple processes, which limits the amount of memory sharing that can be done). So, the way I see it you'd have to treat phi like a GPU: you'll be able to leverage the speeds that are claimed by very specific code and algorithms written for it (or, e.g., by running BLAS on it if that's what you do often), but it will be much slower than Xeons for regular use of R. Your mileage may vary - this is just my personal experience evaluating high-core machines (250+) and R (the lesson was it's better to get multiple low-core, high-clockspeed, high-RAM machines instead - the opposite of phi), not particularly with phi.
>
> Cheers,
> Simon
>
>
>> regards,
>>
>> /iaw
>>
>> ----
>> Ivo Welch (ivo.welch at gmail.com)
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc