On Mon, Jun 25, 2012 at 1:08 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
> On Jun 25, 2012, at 11:57 AM, andre zege wrote:
>
> > On Mon, Jun 25, 2012 at 11:17 AM, Simon Urbanek <simon.urba...@r-project.org> wrote:
> >
> > On Jun 25, 2012, at 10:20 AM, andre zege wrote:
> >
> > > dput() is intended to be parsed by R so the above is not possible
> > > without massaging the output. But why in the world would you use
> > > dput() for something that you want to read in Java? Why don't you
> > > use a format that Java can read easily - such as JSON?
> > >
> > > Cheers,
> > > Simon
> > >
> > > Yeap, except i was just working with someone else's choice. Bigmatrix
> > > code uses dput() to dump desc file of filebacked matrices.
> >
> > Ah, ok, that is indeed rather annoying as it's pretty much the most
> > non-portable storage (across programs) one could come up with. (I
> > presume you're talking about big.matrix from bigmemory?)
> >
> > > I got some time to do a little hack of reading big matrices nicely
> > > to java and was looking for some ways of smoothing the edges of
> > > parsing .desc file a little. I guess i am ok now with parsing .desc
> > > with some regex. One thing i am still wondering about is whether i
> > > really need to convert back and forth between little endian and big
> > > endian. Namely, java platform has little endian native byte order,
> > > and big matrix code writes stuff in big endian. It'd be nice if i
> > > could manipulate that by some #define somewhere in the makefile or
> > > something and make C++ write little endian without byte swapping
> > > every time i need to communicate with big matrix from java.
> >
> > I think you're wrong (if we are talking about bigmemory) - the
> > endianness is governed by the platform as far as I can see. On
> > little-endian machines the big matrix storage is little endian and on
> > big-endian machines it is big-endian.
> >
> > It's very peculiar that the descriptor doesn't even store the
> > endianness - I think you could talk to the authors and suggest that
> > they include most basic information such as endianness and, possibly,
> > change the format to something that is well-defined without having to
> > evaluate it in R (which is highly dangerous and a serious security
> > risk).
> >
> > Cheers,
> > Simon
> >
> > I would assume that hardware should dictate endianness, just like you
> > said. However, the fact is that bigmemory writes in different
> > endianness than java reads in. I simply compare matrices that i write
> > using bigmemory and that I read into java. Unless i transform
> > endianness, i get garbage, and if i swap byte order, i get the same
> > matrix as the one i wrote. So, i don't think i am wrong about that,
> > but i am curious about why it happens and whether it is possible to
> > let bigmemory code write in natural endianness. Then i would not need
> > to transform each double array element back and forth.
>
> I think it has to do with the way you read it in Java since Java
> supports either endianness directly. What methods do you use exactly to
> read it? The on-disk storage is definitely native-endian so C/C++/...
> can simply mmap it with no swapping.
>
> Cheers,
> Simon
>

It's my first week doing Java, actually :). I simply did the following to
read the binary file:

    public static double[] readVector(String fileName) throws IOException {
        FileChannel rChannel = new RandomAccessFile(new File(fileName), "r").getChannel();
        DoubleBuffer dBuf = rChannel.map(FileChannel.MapMode.READ_ONLY, 0, rChannel.size()).asDoubleBuffer();
        double[] vData = new double[(int) rChannel.size() / 8];
        dBuf.get(vData);
        return vData;
    }

I just realized that the DoubleBuffer here is only a view onto a
ByteBuffer, and reading the Java 5 doc for ByteBuffer I see "The initial
order of a byte buffer is always BIG_ENDIAN". So in fact I just need to
check the ByteOrder and change it if it's different from native.
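To see the effect in isolation, here is a minimal sketch that does not touch any file. The class name ByteOrderDemo and its two helpers are made up for illustration; they are not part of bigmemory or of the code above. It writes one double in little-endian order, as a little-endian machine would store it, and then decodes the same 8 bytes with both byte orders:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class: shows that byte order is a property of the
// buffer, not of the data, and that new buffers default to big-endian.
public class ByteOrderDemo {

    // Encode a double into 8 bytes using the given byte order.
    static byte[] encode(double value, ByteOrder order) {
        return ByteBuffer.allocate(8).order(order).putDouble(value).array();
    }

    // Decode 8 bytes as a double using the given byte order.
    static double decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getDouble();
    }

    public static void main(String[] args) {
        // A new buffer is big-endian regardless of the hardware it runs on.
        System.out.println("default order: " + ByteBuffer.allocate(8).order());
        System.out.println("native order:  " + ByteOrder.nativeOrder());

        // Bytes as a little-endian machine would have written them.
        byte[] raw = encode(1.0, ByteOrder.LITTLE_ENDIAN);

        // Matching order recovers the value; the default order does not.
        System.out.println("read as LITTLE_ENDIAN: " + decode(raw, ByteOrder.LITTLE_ENDIAN));
        System.out.println("read as BIG_ENDIAN:    " + decode(raw, ByteOrder.BIG_ENDIAN));
    }
}
```

Reading with the matching order gives back 1.0, while the default big-endian read of the same bytes gives a different number, which would explain the garbage I was seeing.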
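So the correct code should look like this, it seems:

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.ByteOrder;
    import java.nio.DoubleBuffer;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.Arrays;

    public static double[] readVector(String fileName) throws IOException {
        // try-with-resources closes the file and its channel when done
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r");
             FileChannel rChannel = file.getChannel()) {
            MappedByteBuffer mbb = rChannel.map(FileChannel.MapMode.READ_ONLY, 0, rChannel.size());
            // byte buffers start out BIG_ENDIAN; the file is native-endian
            mbb.order(ByteOrder.nativeOrder());
            DoubleBuffer dBuf = mbb.asDoubleBuffer();
            // divide before casting so the element count is computed in long
            double[] vData = new double[(int) (rChannel.size() / 8)];
            dBuf.get(vData);
            // debug output: print the values, not the array reference
            System.out.println(Arrays.toString(vData));
            return vData;
        }
    }

Sorry for the confusion and thanks for the lesson, Simon :)

        [[alternative HTML version deleted]]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel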