[R-SIG-Mac] How to make Mac 64-bit version feature complete?

Thu Dec 15 05:49:51 CET 2011

G'day,

I understand almost none of this (I deal with dogs and cows normally), but historically I had some R packages that would only work in 32-bit (RPgSQL was one), so it was handy to be able to run both 32- and 64-bit instances concurrently. It seems less of an issue for me now (but I'm on a relatively new machine and now use RPostgreSQL!) and most things run in 64-bit.

I guess it's more about R than Mac programming.

I'll shut up and go back to being thankful that (for me as an epidemiologist) it just works.

cheers

Ben

On 15/12/2011, at 10:56 AM, Simon Urbanek wrote:

> On Dec 14, 2011, at 3:42 PM, Adam Strzelecki wrote:
> 
>> I guess this discussion is going nowhere. Since I am always wrong I give up.
>> 
> 
> I would consider understanding/learning a more productive approach than surrender, but it's up to you ;).
> 
> From the beginning you were confusing several entirely unrelated topics (plus unrelated subject), so I'll try for the last time to make it more clear and separate them properly and address them again:
> 
> a) 3-way fat R.app (your proposal) vs 2-way fat R.app + 1-way R64.app (CRAN release).
> This is entirely a choice of convention and we have decided to separate 32-bit binaries from the 64-bit binary for convenience of the user. As I said there are several reasons for this. Whether you like it or not is quite irrelevant - it may not be what Safari does but then Safari is not a stat computing language. Whether you use lipo to add the 64-bit binary to R.app or not is purely cosmetic, it doesn't change the fundamental functionality (except that with R.app+R64.app you can run them in parallel and start the desired architecture more easily - and the user knows what he gets and doesn't need to guess - less relevant for Safari, more so for R).
> 
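To make "2-way fat" versus "3-way fat" concrete, here is a minimal sketch in Python that reads the architecture list from a Mach-O universal (fat) header, which is the structure lipo manipulates. It runs on synthetic header bytes rather than a real R.app binary; the helper names (`fat_architectures`, `fake_fat_header`) are hypothetical, and the choice of ppc and i386 as the two 32-bit slices is an assumption for illustration only.

```python
import struct

FAT_MAGIC = 0xCAFEBABE        # magic number of a Mach-O universal (fat) file
CPU_ARCH_ABI64 = 0x01000000   # flag bit marking a 64-bit CPU type
CPU_NAMES = {
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
}


def fat_architectures(data):
    """Return the architecture names in a fat header, or [] if not fat."""
    if len(data) < 8:
        return []
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return []
    names = []
    for i in range(count):
        # Each fat_arch entry holds five big-endian 32-bit fields:
        # cputype, cpusubtype, offset, size, align.
        cputype = struct.unpack_from(">I", data, 8 + 20 * i)[0]
        names.append(CPU_NAMES.get(cputype, hex(cputype)))
    return names


def fake_fat_header(cputypes):
    """Build a synthetic header (zeroed offsets and sizes) for the demo."""
    header = struct.pack(">II", FAT_MAGIC, len(cputypes))
    for cputype in cputypes:
        header += struct.pack(">IIIII", cputype, 0, 0, 0, 0)
    return header


# Assumed slice layout, for illustration only.
two_way = fat_architectures(fake_fat_header([18, 7]))
three_way = fat_architectures(fake_fat_header([18, 7, 7 | CPU_ARCH_ABI64]))
print(two_way, three_way)
```

Either way the same slices exist on disk; the only difference is whether the 64-bit one lives in the same file or in a separate bundle.
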
> 
> b) multi-lib approach in R itself: (for packages, modules etc.). This is deliberate and I was illustrating that it is very common even in OS X as you can see in the gcc example. The main reason is portability because this way the build system needs to be implemented only once for all platforms. This is hidden from the user so you should not care. If you do, you may need to read up on it.
> 
> 
> c) decision to run 32-bit vs 64-bit binaries. This we leave entirely to the user. You may not be familiar with the details (the link you quoted below is quite irrelevant in the sense that the biggest waste of memory is not on the stack - unless you use deep recursion), but each of the architectures has its weaknesses and advantages. For low-memory machines it is often advisable to use 32-bit binary as the benefits of the 64-bit instruction set are really not that big (and the numerical stability is different, some would say worse). 64-bit binary *always* uses more memory, by definition, that's not even a question. The question is only how much more, and in some practical settings it can be close to 100% more. But, again, we leave this to the user.
> 
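The "how much more" question can be put as rough arithmetic. The sketch below is a simplified model in Python, not a measurement of R: it assumes that only pointers double in size and ignores alignment padding and allocator overhead. `growth_factor` and its argument are hypothetical names.

```python
def growth_factor(pointer_fraction):
    """64-bit footprint relative to 32-bit when only pointers double.

    pointer_fraction is the share of the 32-bit footprint that consists of
    pointers. Padding and allocator overhead are ignored (an assumption).
    """
    if not 0.0 <= pointer_fraction <= 1.0:
        raise ValueError("pointer_fraction must be between 0 and 1")
    return (1.0 - pointer_fraction) + 2.0 * pointer_fraction


numeric_heavy = growth_factor(0.0)   # payload only, e.g. large double vectors
mixed = growth_factor(0.5)           # half payload, half pointers
pointer_heavy = growth_factor(1.0)   # nothing but pointers

print(numeric_heavy, mixed, pointer_heavy)
```

In this model the factor runs from 1.0 to 2.0. A real workload never reaches exactly 1.0, because object headers themselves contain pointers, while pointer-dominated data approaches the "close to 100% more" case.
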
> 
> d) "How to make Mac 64-bit version feature complete?" Well, the CRAN 64-bit version of R *is* feature complete so the question goes off an invalid assumption. Later you mentioned rgl, but that is not part of R.
> 
> 
> 
>> I say having both R.app and R64.app is doubtful (talking about apps from the user's perspective); you justify that with dumps of some GCC directories from the Xcode developer tools (talking about developer tools directories from a developer's perspective). Can you find me a single other app for Mac that comes as two .app packages, one for each architecture, from one install?
>> 
>> I say installing R in 64-bit Linux and launching R GUI app
> 
> There is no such thing on Linux. R.app exists only on OS X. Windows has Rgui and it has two of them as well (32-bit and 64-bit) and they also come with two separate icons, same as on OS X. Go figure :).
> 
> 
>> launches 64-bit version (talking how Linux hides its internals), you say I am wrong because you can launch 32-bit app using --arch param. Yes you can, does it prove it is wrong what I said? Typing "R" in OSX command line also launches R in 64-bit, why not 32-bit then?
>> 
> 
> Because this decision is made at install time depending on your OS X version - it is 32-bit for Leopard, for example. Whether this is the right thing to do is up for discussion (when installed from source you'll get the last installed architecture). R.app doesn't need to make that decision, since it is always 32-bit - easy.
> 
> I hope that helps understanding it a bit better. If not, please, consider learning more about R.
> 
> Cheers,
> Simon
> 
> 
>> I say 64-bit code is faster (I haven't used "always", but intentionally used the word "code", not "program" or "libraries", because these can be badly ported to 64-bit due to an old compiler or 32-bit-only hand-optimized assembly code not working in 64-bit; I refer to the machine code that has wider registers than in 32-bit mode), you say it is not; because it can be slower or faster depending on the task, huh :/
>> 
> 
> Yep. If you know a bit more about CPUs, you'll realize that the width of the register is quite irrelevant until you actually use it - which is not very often the case (there are very few actual 64-bit operations). Most of the time you use it for pointers, where you would have used 32 bits on x86 anyway, so you don't gain anything, rather to the contrary. You can gain from having more registers, but only for functions with a larger number of arguments (register passing) or functions that are complex enough (fn body). Also, larger objects (due to double the size of pointers) mean more very slow operations (out-of-cache memory I/O) and thus slower performance. Most gains you see are in fact due to something different - the fact that x86_64 CPUs can be assumed to have extensions that some old x86 CPUs did not have (various SIMD instructions etc.) which can be used instead of the FPU.
> 
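The out-of-cache point can be illustrated with simple arithmetic. This is a sketch in Python under stated assumptions, not a benchmark: the 6 MiB cache size is hypothetical, `elements_in_cache` is a made-up helper, and real cache behaviour depends on much more than raw capacity.

```python
import struct


def elements_in_cache(cache_bytes, element_bytes):
    """How many fixed-size elements fit in a cache of the given size."""
    return cache_bytes // element_bytes


CACHE_BYTES = 6 * 1024 * 1024   # hypothetical 6 MiB cache (assumption)

fit32 = elements_in_cache(CACHE_BYTES, 4)   # 4-byte pointers (32-bit build)
fit64 = elements_in_cache(CACHE_BYTES, 8)   # 8-byte pointers (64-bit build)

# Pointer size of the interpreter running this script, for comparison.
native_pointer_bytes = struct.calcsize("P")

print(fit32, fit64, native_pointer_bytes)
```

With half as many pointers fitting in the same cache, a pointer-heavy working set spills to main memory sooner in a 64-bit build, which is one way the same task can end up slower.
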
> Even if you don't know much about Intel CPU architecture, just run some benchmarks in R - you'll see that it goes either way - you'll have tasks that are slower in 64-bit and you'll have other tasks that are faster in 64-bit (you may be able to dig up examples from this list). It really depends on things like the memory I/O load + data size, types of vectors you operate on, proportion of interpreted code etc. Since R is an interpreter, you'll see bigger negative impact due to increased 64-bit pointer size than you would see in native code (the most vulnerable are generic vectors, lists, environments and character vectors - which are incidentally the most used objects).
> 
> 
>> You say 64-bit code always use more memory, then you've probably read that: http://software.intel.com/en-us/blogs/2010/07/01/the-reasons-why-64-bit-programs-require-more-stack-memory/
>> 
>> Altogether I am just plain wrong :)
>> 
>> Thank you,
>> -- 
>> Adam Strzelecki
>> 
> 
> _______________________________________________
> R-SIG-Mac mailing list
> R-SIG-Mac at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac