MarcoGorelli opened a new issue, #46301: URL: https://github.com/apache/arrow/issues/46301
### Describe the bug, including details regarding any error messages, version, and platform.

```python
from datetime import datetime

import pyarrow as pa
import pyarrow.compute as pc
import duckdb

# DuckDB: bucket into 3-year intervals with an explicit origin of 1970-01-01
print(duckdb.sql(
    """
    from values (timestamp '1970-01-01') df(a)
    select time_bucket('3 years', "a", timestamp '1970-01-01')
    """
))
# PyArrow: floor to a multiple of 3 years
print(pc.floor_temporal(pa.array([datetime(1970, 1, 1)]), 3, 'year'))
```

Outputs:

```
┌────────────────────────────────────────────────────────────┐
│ time_bucket('3 years', a, CAST('1970-01-01' AS TIMESTAMP)) │
│                         timestamp                          │
├────────────────────────────────────────────────────────────┤
│ 1970-01-01 00:00:00                                        │
└────────────────────────────────────────────────────────────┘

[
  1968-01-01 00:00:00.000000
]
```

The DuckDB output differs from the PyArrow one. The pyarrow docs say

> By default, the origin is 1970-01-01T00:00:00.

so I would expect PyArrow to match DuckDB when DuckDB is given `timestamp '1970-01-01'` as the origin.

In fact, if I use `36, 'month'`, then PyArrow also returns `'1970-01-01'`. The fact that `3, 'year'` differs from `3*12, 'month'` suggests to me that there's a bug.

```python
In [6]: pc.floor_temporal(arr, 3, 'year')
Out[6]:
<pyarrow.lib.TimestampArray object at 0x7fe44180fca0>
[
  1968-01-01 00:00:00.000000
]

In [7]: pc.floor_temporal(arr, 3*12, 'month')
Out[7]:
<pyarrow.lib.TimestampArray object at 0x7fe443a39540>
[
  1970-01-01 00:00:00.000000
]
```

### Component(s)

Python
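
For illustration only, here is a minimal pure-Python sketch of one arithmetic that would reproduce both results. It assumes, without confirmation from the Arrow source, that the `'year'` path rounds the calendar year down to a multiple counted from year 0, while the `'month'` path counts months from 1970-01. The helper names are hypothetical and are not part of PyArrow.

```python
from datetime import datetime


def floor_years_from_year_zero(ts: datetime, multiple: int) -> datetime:
    # Assumed behaviour: round the calendar year down to a multiple of
    # `multiple`, counting from year 0 rather than from the 1970 origin.
    return datetime((ts.year // multiple) * multiple, 1, 1)


def floor_years_from_origin(ts: datetime, multiple: int, origin_year: int = 1970) -> datetime:
    # Round down the number of whole years elapsed since the origin year.
    offset = (ts.year - origin_year) // multiple * multiple
    return datetime(origin_year + offset, 1, 1)


def floor_months_from_origin(ts: datetime, multiple: int, origin_year: int = 1970) -> datetime:
    # Round down the number of whole months elapsed since January of the origin year.
    months = (ts.year - origin_year) * 12 + (ts.month - 1)
    floored = months // multiple * multiple
    return datetime(origin_year + floored // 12, floored % 12 + 1, 1)


ts = datetime(1970, 1, 1)
print(floor_years_from_year_zero(ts, 3))   # 1970 // 3 * 3 == 1968
print(floor_years_from_origin(ts, 3))      # 0 years since the origin
print(floor_months_from_origin(ts, 36))    # 0 months since the origin
```

Under these assumptions, the year-zero variant gives 1968-01-01 (matching `3, 'year'`), while both origin-based variants give 1970-01-01 (matching `36, 'month'` and DuckDB's `time_bucket`).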