On Fri, 30 May 2008, Prof Brian Ripley wrote:

> On Fri, 30 May 2008, Duncan Murdoch wrote:
>
> > On 5/30/2008 1:55 PM, Prof Brian Ripley wrote:
> >> Well, R has no unsigned quantities, so ultimately you can't actually do
> >> this.  But using what="int" and an appropriate 'size' (likely to be 8)
> >> should read the numbers, wrapping around very large ones to be negative.
> >> (The usual trick of storing integers in numeric will lose accuracy, but
> >> might be better than nothing.)
> >
> > I think reading size 8 integers on 32 bit Windows returns signed 32 bit
> > integers, with values outside that range losing the high order bits, not
> > just accuracy.  At least that's what I see when I write the numbers 1:10
> > out as 4 byte integers, and read them as 8 byte integers: I get 1 3 5 7 9.
>
> Yes, that's true for even larger ones.
>
> So to clarify: up to 2^31-1 should work, thereafter you will get the lower
> 32 bits and hence possibly a signed number.
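As a rough illustration of the truncation described in the quoted text, here is a minimal C sketch. It is not R's actual readBin() code. The helper names low32_signed, pair_as_u64 and host_is_little_endian are made up for this example, and the 1 3 5 7 9 pattern assumes a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: keep only the low-order 32 bits of an 8-byte
 * value and reinterpret them as a signed 32-bit integer. */
static int32_t low32_signed(uint64_t x)
{
    uint32_t low = (uint32_t)(x & 0xFFFFFFFFu);
    int32_t out;
    memcpy(&out, &low, sizeof out);   /* bit-for-bit reinterpretation */
    return out;
}

/* Hypothetical helper: view two adjacent 4-byte integers as one 8-byte
 * integer, in the host's native byte order. */
static uint64_t pair_as_u64(int32_t first, int32_t second)
{
    int32_t pair[2] = { first, second };
    uint64_t x;
    memcpy(&x, pair, sizeof x);
    return x;
}

/* Returns 1 when the least significant byte is stored first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

On a little-endian machine, low32_signed(pair_as_u64(1, 2)) is 1 and low32_signed(pair_as_u64(3, 4)) is 3, which is the 1 3 5 7 9 pattern Duncan reports. low32_signed((1ULL << 32) - 1) is -1, the "possibly a signed number" case.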

When we wrote a version of readBin() for Splus 8.0 we added an extra
argument, output=, that specifies the type of S object to put the result
into.  The what= argument says what sort of data is in the input file and
by default output=what.  output="double" can be useful in this case, as a
double can store a 53 bit signed or unsigned integer without loss of
precision.  If the integer is bigger than 2^53-1, the double stores its
most significant 53 bits, which may be better than truncating the thing.

E.g., I wrote a C program to write some unsigned long longs to a file:

   #include <stdio.h>

   int main(int argc, char *argv[])
   {
       unsigned long long data[7], one = 1ULL ;
       data[0] = one ;
       data[1] = (one<<31) - 1 ;
       data[2] = (one<<31) + 1 ;
       data[3] = (one<<32) - 1 ;
       data[4] = (one<<32) + 1 ;
       data[5] = (one<<52) + 1 ;
       data[6] = (one<<54) + 1 ;
       (void)fwrite((void *)data, sizeof(data[0]),
                    sizeof(data)/sizeof(data[0]), stdout) ;
       return 0 ;
   }

od shows what it writes, as unsigned, signed, and hex 8 byte integers:

   % ./a.out | od --format u8
   0000000                    1           2147483647
   0000020           2147483649           4294967295
   0000040           4294967297     4503599627370497
   0000060    18014398509481985
   0000070
   % ./a.out | od --format d8
   0000000                    1           2147483647
   0000020           2147483649           4294967295
   0000040           4294967297     4503599627370497
   0000060    18014398509481985
   0000070
   % ./a.out | od --format x8
   0000000 0000000000000001 000000007fffffff
   0000020 0000000080000001 00000000ffffffff
   0000040 0000000100000001 0010000000000001
   0000060 0040000000000001
   0000070

and in 32-bit Splus I can read it with:

   > z<-readBin(pipe("./a.out", open="br"), what="integer", n=7, size=8, signed=FALSE, output="double")
   > print(z, digits=16)
   [1]                 1        2147483647        2147483649        4294967295
   [5]        4294967297  4503599627370497 18014398509481984

Note that it loses precision where z[7]>2^53.
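
The 53-bit limit mentioned above can be checked directly in C. This is only a sketch of the arithmetic, not how the Splus output= argument is implemented. The helper names u64_to_double and survives_double are invented for the example, and it assumes IEEE 754 doubles.

```c
#include <stdint.h>

/* Hypothetical helper: convert an unsigned 64-bit integer to double.
 * Assumes an IEEE 754 double with a 53-bit significand. */
static double u64_to_double(uint64_t x)
{
    return (double)x;
}

/* Returns 1 if x comes back unchanged after a round trip through
 * double, 0 if precision was lost. */
static int survives_double(uint64_t x)
{
    double d = (double)x;
    /* Guard: converting a double >= 2^64 back to uint64_t is undefined. */
    if (d >= 18446744073709551616.0)
        return 0;
    return (uint64_t)d == x;
}
```

With these, survives_double((1ULL << 52) + 1) is 1, while survives_double((1ULL << 54) + 1) is 0 because u64_to_double((1ULL << 54) + 1) comes out as 18014398509481984, matching z[7] in the transcript.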

Without output="double", the numbers above 2^32 would be truncated to
their low 32 bits and the signs would be wrong on the ones between 2^31
and 2^32:

   > readBin(pipe("./a.out", open="br"), what="integer", n=7, size=8, signed=FALSE)
   [1]           1  2147483647 -2147483647          -1           1           1
   [7]           1

(That one gives the same result in R and Splus.)

What do folks think about having this option in R?

----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel