[R] R badly lags matlab on performance?
Stavros Macrakis
macrakis at alum.mit.edu
Mon Jan 5 00:38:40 CET 2009
On Sun, Jan 4, 2009 at 4:50 PM, <luke at stat.uiowa.edu> wrote:
> On Sun, 4 Jan 2009, Stavros Macrakis wrote:
>> On Sat, Jan 3, 2009 at 7:02 PM, <luke at stat.uiowa.edu> wrote:
>>> R's interpreter is fairly slow due in large part to the allocation of
>>> argument lists and the cost of lookups of variables,
I'd think another problem is call-by-need: every argument has to be
packaged up as a promise. I suppose inlining, or analyzing groups of
functions as a batch, would help there.
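(To make the cost concrete, here is a toy illustration of my own, not
anything from Luke's message: each argument arrives as a promise
carrying an unevaluated expression, which is only forced when the body
actually touches it.)

    f <- function(x) {
      cat("entered f, x not yet evaluated\n")
      x                      # forcing the promise happens here
    }
    f({ cat("now evaluating the argument\n"); 42 })
    ## entered f, x not yet evaluated
    ## now evaluating the argument
    ## [1] 42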
>>> including ones like [<- that are assembled and looked up as strings on every call.
>> Wow, I had no idea the interpreter was so awful. Just some simple tree-to-tree transformations would speed things up, I'd think, e.g. `<-`(`[`(...), ...) ==> `<-[`(...,...).
> 'Awful' seems a bit strong.
Well, I haven't looked at the code, but if I'm interpreting "assembled
and looked up as strings on every call" correctly, this means taking
names, converting them to strings, concatenating them, re-interning
the result, and then looking up the value. That sounds pretty awful to
me, both in the sense of being inefficient and in the sense of being ugly.
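(For concreteness: the replacement-function protocol itself is
documented R semantics; what I'm guessing at above is only the
string-pasting step inside the evaluator.)

    x <- c(10, 20, 30)
    x[2] <- 99                # what we write
    y <- c(10, 20, 30)
    y <- `[<-`(y, 2, 99)      # roughly what the evaluator arranges
    identical(x, y)           # TRUE: both go through the `[<-` function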
>> I'd think that one of the challenges will be the dynamic types --...
> I am for now trying to get away without declarations and pre-testing
> for the best cases before passing others off to the current internal
> code.
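Just to make sure I follow, I take that to mean something like this
sketch (the names are mine and purely illustrative, not the real
internals):

    fast_sum_dbl <- function(x) sum(x)              # stand-in for a specialized routine
    general_sum  <- function(x) sum(as.numeric(x))  # stand-in for the general path

    my_sum <- function(x) {
      if (is.double(x) && is.null(attributes(x)))
        fast_sum_dbl(x)      # cheap pre-test succeeded: fast case
      else
        general_sum(x)       # everything else goes to the existing code
    }
    my_sum(c(1, 2, 3))       # takes the fast path
    my_sum(1:3)              # integer input falls back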
Have you considered compiling to Java bytecode and taking advantage of
dynamic compilers like HotSpot? They often do a good job in cases
like this by assuming that types are fairly predictable from one run
of a piece of code to the next. Or is the Java semantic model too
different?
> ...There is always a trade-off in complicating the code and the consequences for maintainability that implies.
Agreed entirely!
> A 1.5 factor difference here I find difficult to get excited about, but it might be worth a look.
I agree. A factor of 1.5 isn't a big deal at all.
-s