This is an automated email from the ASF dual-hosted git repository.
felipecrv pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new b10386e6a0 GH-44248: [Format] Add TimestampWithOffset canonical extension type (#48002)
b10386e6a0 is described below
commit b10386e6a0ea5f3bbe6c00dcd124819b7fbf3071
Author: Lucas Valente <[email protected]>
AuthorDate: Fri Dec 5 17:34:49 2025 +0100
GH-44248: [Format] Add TimestampWithOffset canonical extension type (#48002)
### Rationale for this change
Closes #44248
Arrow has no canonical way of representing the SQL `TIMESTAMP WITH
TIME ZONE` type, which many database systems support. Without one,
users must either convert to UTC and drop the time zone, which may have
correctness implications, or rely on bespoke workarounds. A new
`arrow.timestamp_with_offset` extension type would introduce a standard,
canonical way of representing that information.
Rust implementation: https://github.com/apache/arrow-rs/pull/8743
Go implementation: https://github.com/apache/arrow-go/pull/558
[DISCUSS] [thread in the mailing
list](https://lists.apache.org/thread/yhbr3rj9l59yoxv92o2s6dqlop16sfnk).
### What changes are included in this PR?
Proposal and documentation for `arrow.timestamp_with_offset` canonical
extension type.
### Are these changes tested?
N/A
### Are there any user-facing changes?
Yes, this is an extension to the Arrow format.
* GitHub Issue: #44248
---------
Co-authored-by: David Li <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Co-authored-by: Felipe Oliveira Carvalho <[email protected]>
---
docs/source/format/CanonicalExtensions.rst | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/docs/source/format/CanonicalExtensions.rst b/docs/source/format/CanonicalExtensions.rst
index 8608a6388e..697e7627d8 100644
--- a/docs/source/format/CanonicalExtensions.rst
+++ b/docs/source/format/CanonicalExtensions.rst
@@ -544,6 +544,33 @@ Primitive Type Mappings
| UUID extension type | UUID |
+----------------------+------------------------+
+.. _timestamp_with_offset_extension:
+
+Timestamp With Offset
+=====================
+This type represents a timestamp column that stores potentially different timezone offsets per value. The timestamp is stored in UTC alongside the original timezone offset in minutes.
+This extension type is intended to be compatible with ANSI SQL's ``TIMESTAMP WITH TIME ZONE``, which is supported by multiple database engines.
+
+* Extension name: ``arrow.timestamp_with_offset``.
+
+* The storage type of the extension is a ``Struct`` with 2 fields, in order:
+
+ * ``timestamp``: a non-nullable ``Timestamp(time_unit, "UTC")``, where ``time_unit`` is any Arrow ``TimeUnit`` (s, ms, us or ns).
+
+ * ``offset_minutes``: a non-nullable signed 16-bit integer (``Int16``) representing the offset in minutes from the UTC timezone. Negative offsets represent time zones west of UTC, while positive offsets represent east. Offsets normally range from -779 (-12:59) to +780 (+13:00).
+
+* Extension type parameters:
+
+ This type does not have any parameters.
+
+* Description of the serialization:
+
+ Extension metadata is an empty string.
+
+.. note::
+
+ It is also *permissible* for the ``offset_minutes`` field to be dictionary-encoded or run-end-encoded.
+
Community Extension Types
=========================
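
The storage layout added by this diff (a UTC ``timestamp`` plus a signed ``offset_minutes``) can be illustrated with a minimal sketch that uses only the Python standard library. It is not taken from the commit or from any Arrow implementation: the helper names ``encode`` and ``decode`` are hypothetical, microseconds are assumed as the time unit, and a plain tuple stands in for one struct value.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MINUTE = timedelta(minutes=1)
ONE_MICROSECOND = timedelta(microseconds=1)


def encode(dt: datetime) -> tuple[int, int]:
    """Split an offset-aware datetime into (utc_micros, offset_minutes).

    Hypothetical helper: mirrors the two struct fields described in the
    diff, assuming a microsecond time unit.
    """
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("datetime must carry a UTC offset")
    if offset % ONE_MINUTE:
        raise ValueError("offset must be a whole number of minutes")
    # Aware datetime subtraction compares instants, so this is the UTC value.
    utc_micros = (dt - EPOCH) // ONE_MICROSECOND
    offset_minutes = offset // ONE_MINUTE  # negative west of UTC, positive east
    return utc_micros, offset_minutes


def decode(utc_micros: int, offset_minutes: int) -> datetime:
    """Rebuild the original local datetime from the two stored fields."""
    tz = timezone(timedelta(minutes=offset_minutes))
    return (EPOCH + timedelta(microseconds=utc_micros)).astimezone(tz)


if __name__ == "__main__":
    local = datetime(2025, 12, 5, 17, 34, 49, tzinfo=timezone(timedelta(hours=1)))
    stored = encode(local)
    restored = decode(*stored)
    print(stored[1], restored.isoformat())
```

Because the instant is kept in UTC, values with different offsets still sort and compare correctly on the ``timestamp`` field alone, while ``offset_minutes`` is needed only to recover the original local rendering.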