On Jun 25, 2012, at 11:57 AM, andre zege wrote:

> On Mon, Jun 25, 2012 at 11:17 AM, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
> > On Jun 25, 2012, at 10:20 AM, andre zege wrote:
> >
> > dput() is intended to be parsed by R so the above is not possible without
> > massaging the output. But why in the world would you use dput() for
> > something that you want to read in Java? Why don't you use a format that
> > Java can read easily - such as JSON?
> >
> > Cheers,
> > Simon
> >
> > Yeap, except I was just working with someone else's choice. Bigmatrix code
> > uses dput() to dump the desc file of filebacked matrices.
>
> Ah, ok, that is indeed rather annoying as it's pretty much the most
> non-portable storage (across programs) one could come up with. (I presume
> you're talking about big.matrix from bigmemory?)
>
> > I got some time to do a little hack of reading big matrices nicely into Java
> > and was looking for ways of smoothing the edges of parsing the .desc file a
> > little. I guess I am ok now with parsing .desc with some regex. One thing
> > I am still wondering about is whether I really need to convert back and
> > forth between little endian and big endian. Namely, the Java platform has
> > little endian native byte order, and big matrix code writes stuff in big
> > endian. It'd be nice if I could manipulate that by some #define somewhere
> > in the makefile or something and make C++ write little endian without byte
> > swapping every time I need to communicate with big matrix from Java.
>
> I think you're wrong (if we are talking about bigmemory) - the endianness is
> governed by the platform as far as I can see. On little-endian machines the
> big matrix storage is little-endian and on big-endian machines it is
> big-endian.
>
> It's very peculiar that the descriptor doesn't even store the endianness - I
> think you could talk to the authors and suggest that they include the most basic
> information such as endianness and, possibly, change the format to something
> that is well-defined without having to evaluate it in R (which is highly
> dangerous and a serious security risk).
>
> Cheers,
> Simon
>
> I would assume that hardware should dictate endianness, just like you said.
> However, the fact is that bigmemory writes in a different endianness than Java
> reads in. I simply compare matrices that I write using bigmemory and that I
> read into Java. Unless I transform endianness, I get garbage, and if I swap
> byte order, I get the same matrix as the one I wrote. So, I don't think I am
> wrong about that, but I am curious about why it happens and whether it is
> possible to let bigmemory code write in natural endianness. Then I would not
> need to transform each double array element back and forth.

I think it has to do with the way you read it in Java, since Java supports either endianness directly. What methods do you use exactly to read it? The on-disk storage is definitely native-endian, so C/C++/... can simply mmap it with no swapping.

Cheers,
Simon

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
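
To illustrate the point about reading in Java, here is a minimal sketch. It assumes the backing file is a flat array of 8-byte doubles written on a little-endian machine, and it simulates the file contents in memory. The names EndianDemo, encode and decode are made up for this example and are not part of bigmemory. Java's ByteBuffer starts out big-endian regardless of the platform, and DataInputStream.readDouble() always reads big-endian, which may explain the garbage seen unless the byte order is set explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class EndianDemo {
    // Hypothetical helper: lay out doubles as raw bytes in the given order,
    // standing in for what a native writer on that kind of machine produces.
    static byte[] encode(double[] values, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES).order(order);
        buf.asDoubleBuffer().put(values);
        return buf.array();
    }

    // Hypothetical helper: interpret raw bytes as doubles in the given order.
    // The DoubleBuffer view inherits the order set on the ByteBuffer, so no
    // per-element byte swapping is written by hand.
    static double[] decode(byte[] raw, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        double[] out = new double[raw.length / Double.BYTES];
        buf.asDoubleBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        double[] column = {1.0, 2.5, -3.0};

        // Assumption: the data was written little-endian (e.g. on x86).
        byte[] onDisk = encode(column, ByteOrder.LITTLE_ENDIAN);

        // Matching order recovers the original values.
        System.out.println(Arrays.toString(decode(onDisk, ByteOrder.LITTLE_ENDIAN)));

        // Java's default order (big-endian) misreads the same bytes.
        System.out.println(Arrays.toString(decode(onDisk, ByteOrder.BIG_ENDIAN)));
    }
}
```

For a real file one would presumably get the buffer from FileChannel.map(...) and call order(ByteOrder.nativeOrder()), or order(ByteOrder.LITTLE_ENDIAN) explicitly, before creating the DoubleBuffer view.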