farkmarnum opened a new issue, #83: URL: https://github.com/apache/arrow-js/issues/83
### Describe the bug, including details regarding any error messages, version, and platform.

While constructing a `Vector` of `Timestamp` values, any pre-epoch values end up offset by approximately `2^32 * 10^(3n)`, where `n` depends on the precision.

<details>
<summary>
I've written a script for minimal repro using the latest version of the apache-arrow NPM package (12.0.0).
</summary>

```javascript
// test.mjs
import {
  Date_,
  makeBuilder,
  Timestamp,
  TimeUnit,
  DateUnit,
} from "apache-arrow";

/**
 * @param {Date} value
 * @param {TimeUnit} precision
 * @returns {number}
 */
const testTimestamp = (value, precision) => {
  const columnBuilder = makeBuilder({ type: new Timestamp(precision) });
  columnBuilder.append(value);
  const vec = columnBuilder.finish().toVector();
  const valueInVec = Array.from(vec)[0];
  return +value - +valueInVec;
};

console.log("Testing TIMESTAMP\n");
[
  ["2000-01-01", TimeUnit.NANOSECOND],
  ["2000-01-01", TimeUnit.MICROSECOND],
  ["2000-01-01", TimeUnit.MILLISECOND],
  ["2000-01-01", TimeUnit.SECOND],
  [],
  ["1900-01-01", TimeUnit.NANOSECOND],
  ["1900-01-01", TimeUnit.MICROSECOND],
  ["1900-01-01", TimeUnit.MILLISECOND],
  ["1900-01-01", TimeUnit.SECOND],
  [],
  ["1969-12-31 23:59:59Z", TimeUnit.NANOSECOND],
  ["1969-01-01", TimeUnit.NANOSECOND],
  ["1900-01-01", TimeUnit.NANOSECOND],
  ["1800-01-01", TimeUnit.NANOSECOND],
  ["1700-01-01", TimeUnit.NANOSECOND],
].forEach(([dateStr, precision]) => {
  if (!dateStr) {
    console.log();
    return;
  }
  const outcome = testTimestamp(new Date(dateStr), precision);
  const label = `${dateStr} w/ ${TimeUnit[precision]}`;
  console.log(`${label.padEnd(40)} => ${outcome}`);
});

console.log("\n\nTesting DATE\n");

/**
 * @param {Date} value
 * @returns {Date}
 */
const testDate = (value, precision) => {
  const columnBuilder = makeBuilder({ type: new Date_(DateUnit.DAY) });
  columnBuilder.append(value);
  const vec = columnBuilder.finish().toVector();
  const valueInVec = Array.from(vec)[0];
  return valueInVec;
};

[
  ["1969-12-15 01:00:00Z", "rounded up"],
  ["1969-12-31 01:00:00Z", "rounded up"],
  ["1970-01-01 00:00:00Z", "rounded down"],
  ["1970-01-01 23:00:00Z", "rounded down"],
  ["1970-01-15 01:00:00Z", "rounded down"],
].forEach(([dateStr, label]) => {
  const date = new Date(dateStr);
  const outcome = testDate(date);
  console.log(`${date.toISOString()} => ${outcome.toISOString()} (${label})`);
});
```

</details>

Here's the output:

```
Testing TIMESTAMP

2000-01-01 w/ NANOSECOND                 => 0
2000-01-01 w/ MICROSECOND                => 0
2000-01-01 w/ MILLISECOND                => 0
2000-01-01 w/ SECOND                     => 0

1900-01-01 w/ NANOSECOND                 => -4294.96728515625
1900-01-01 w/ MICROSECOND                => -4294967.2958984375
1900-01-01 w/ MILLISECOND                => -4294967296
1900-01-01 w/ SECOND                     => -4294967296000

1969-12-31 23:59:59Z w/ NANOSECOND       => -4294.967296
1969-01-01 w/ NANOSECOND                 => -4294.967296600342
1900-01-01 w/ NANOSECOND                 => -4294.96728515625
1800-01-01 w/ NANOSECOND                 => -4294.9677734375
1700-01-01 w/ NANOSECOND                 => -4294.966796875


Testing DATE

1969-12-15T01:00:00.000Z => 1969-12-16T00:00:00.000Z (rounded up)
1969-12-31T01:00:00.000Z => 1970-01-01T00:00:00.000Z (rounded up)
1970-01-01T00:00:00.000Z => 1970-01-01T00:00:00.000Z (rounded down)
1970-01-01T23:00:00.000Z => 1970-01-01T00:00:00.000Z (rounded down)
1970-01-15T01:00:00.000Z => 1970-01-15T00:00:00.000Z (rounded down)
```

As you can see, for the `Timestamp` type, post-epoch values pass through unscathed, but pre-epoch values end up off by a fixed increment (in milliseconds) of:

- approximately `2^32 / 1,000,000` for `NANOSECOND` precision
- approximately `2^32 / 1,000` for `MICROSECOND` precision
- `2^32` for `MILLISECOND` precision
- `2^32 * 1,000` for `SECOND` precision

Note: I say "fixed", but when the offset is a float (for `NANOSECOND` and `MICROSECOND` precisions) it varies slightly, as you can see in the last block of `Timestamp` tests. The variation seems to grow as the value gets further from the epoch.

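To illustrate how an offset of exactly `2^32` could arise, here is a minimal sketch. It assumes the value is stored as a low and a high 32-bit word and that the high word is rounded toward zero instead of toward negative infinity. This is only a guess at the failure mode, not a reading of the apache-arrow source, and `splitTruncating`, `splitFlooring`, and `join` are hypothetical helpers, not library APIs.

```javascript
// Hypothetical helpers for illustration only; these are NOT apache-arrow APIs.
const TWO_32 = 2 ** 32;

// Split an integer into [lo, hi] 32-bit words, rounding the high word toward zero.
// For negative inputs this loses the borrow that two's complement requires.
function splitTruncating(value) {
  const lo = value >>> 0; // low 32 bits as an unsigned integer (wraps modulo 2^32)
  const hi = Math.trunc(value / TWO_32) | 0;
  return [lo, hi];
}

// Same split, but the high word is rounded toward negative infinity.
function splitFlooring(value) {
  const lo = value >>> 0;
  const hi = Math.floor(value / TWO_32) | 0;
  return [lo, hi];
}

// Reassemble a signed value from its [lo, hi] words.
function join([lo, hi]) {
  return hi * TWO_32 + lo;
}

const preEpoch = Date.parse("1900-01-01T00:00:00Z"); // negative milliseconds
const postEpoch = Date.parse("2000-01-01T00:00:00Z"); // positive milliseconds

console.log(join(splitFlooring(preEpoch)) - preEpoch); // 0
console.log(join(splitTruncating(preEpoch)) - preEpoch); // 4294967296
console.log(join(splitTruncating(postEpoch)) - postEpoch); // 0
```

In this sketch the truncating split reproduces the pattern seen with `MILLISECOND` precision: post-epoch values round-trip exactly, while pre-epoch values come back `2^32` too large. Whether the library actually takes this path is unverified.
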
It seems like the Arrow `Timestamp` type is a 64-bit integer count of {precision} units since the epoch, but is represented in JS as two 32-bit ints. Something is going wrong for pre-epoch timestamps, whose internal representation is a negative number. I'm guessing it has to do with 32-bit arithmetic and/or float precision issues.

Additionally, for the `Date_` type, post-epoch values seem to "round down" to the start of the day properly, but pre-epoch values seem to "round up" to the next day. I'm guessing that this is a related issue.

### Component(s)

JavaScript