[Rd] how to manipulate dput output format
simon.urbanek at r-project.org
Mon Jun 25 17:17:18 CEST 2012
On Jun 25, 2012, at 10:20 AM, andre zege wrote:
> dput() is intended to be parsed by R so the above is not possible without massaging the output. But why in the world would you use dput() for something that you want to read in Java? Why don't you use a format that Java can read easily - such as JSON?
> Yep, except I was just working with someone else's choice. The big.matrix code uses dput() to dump the .desc file of file-backed matrices.
Ah, ok, that is indeed rather annoying as it's pretty much the most non-portable storage (across programs) one could come up with. (I presume you're talking about big.matrix from bigmemory?)
> I got some time to do a little hack for reading big matrices nicely into Java and was looking for ways to smooth the edges of parsing the .desc file a little. I guess I am OK now with parsing .desc with some regex. One thing I am still wondering about is whether I really need to convert back and forth between little endian and big endian. Namely, the Java platform has little endian native byte order, and the big matrix code writes stuff in big endian. It'd be nice if I could control that with some #define somewhere in the makefile or something and make C++ write little endian, without byte swapping every time I need to communicate with big matrix from Java.
I think you're wrong (if we are talking about bigmemory) - the endianness is governed by the platform as far as I can see. On little-endian machines the big matrix storage is little endian and on big-endian machines it is big-endian.
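On the Java side the byte order can be chosen per buffer, so manual byte swapping should not be needed either way. A minimal sketch, assuming the backing file holds raw 8-byte doubles (the class and method names are hypothetical, not part of bigmemory): a java.nio.ByteBuffer starts out big-endian on every platform, and order() switches it to whatever the machine that wrote the file used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class EndianSketch {
    // Decode raw bytes as 8-byte IEEE 754 doubles in the given byte order.
    static double[] decodeDoubles(byte[] raw, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        double[] out = new double[raw.length / Double.BYTES];
        // The view buffer inherits the byte order set above.
        buf.asDoubleBuffer().get(out);
        return out;
    }

    // Encode doubles to raw bytes in the given byte order.
    static byte[] encodeDoubles(double[] values, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES).order(order);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // Order of the machine running this JVM (little-endian on x86).
        System.out.println("native order: " + ByteOrder.nativeOrder());

        // Simulate data written by a little-endian machine, then read it back.
        byte[] raw = encodeDoubles(new double[] {1.5, -2.0}, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Arrays.toString(decodeDoubles(raw, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

If the file is always produced and consumed on the same machine, passing ByteOrder.nativeOrder() would match what the C++ side wrote. For files moved between machines the order has to come from somewhere else, since the descriptor does not record it.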
It's very peculiar that the descriptor doesn't even store the endianness - I think you could talk to the authors and suggest that they include most basic information such as endianness and, possibly, change the format to something that is well-defined without having to evaluate it in R (which is highly dangerous and a serious security risk).
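Until the format changes, the regex route mentioned earlier in the thread at least avoids evaluating the file. A rough sketch, assuming the descriptor contains simple `name = value` pairs with quoted strings or plain numbers; the field names and sample text below are made up for illustration and are not taken from an actual .desc file:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class DescFieldSketch {
    // Find `name = "text"` or `name = 123L` in dput-style text and return the
    // bare value. Strings with escaped quotes are not handled.
    static Optional<String> field(String desc, String name) {
        Pattern p = Pattern.compile(
            "\\b" + Pattern.quote(name) + "\\s*=\\s*"
            + "(?:\"([^\"]*)\""                                // quoted string
            + "|(-?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)L?)");   // number, optional R integer suffix
        Matcher m = p.matcher(desc);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(m.group(1) != null ? m.group(1) : m.group(2));
    }

    public static void main(String[] args) {
        // Made-up sample text, not real bigmemory output.
        String sample = "list(filename = \"example.bin\", totalRows = 1000L, "
            + "totalCols = 20L, type = \"double\")";
        System.out.println(field(sample, "filename").orElse("?"));
        System.out.println(field(sample, "totalRows").orElse("?"));
    }
}
```

This only picks out scalar fields and would break on nested or vector values, which is another argument for a well-defined format.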