emkornfield commented on code in PR #16446:
URL: https://github.com/apache/iceberg/pull/16446#discussion_r3284336698
##########
format/spec.md:
##########
@@ -514,12 +514,16 @@ Partition field IDs must be reused if an existing partition spec contains an equ
| **`truncate[W]`** | Value truncated to width `W` (see below) | `int`, `long`, `decimal`, `string`, `binary` | Source type |
| **`year`** | Extract a date or timestamp year, as years from 1970 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
| **`month`** | Extract a date or timestamp month, as months from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
-| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
+| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `date` [1] |
| **`hour`** | Extract a timestamp hour, as hours from 1970-01-01 00:00:00 | `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
| **`void`** | Always produces `null` | Any | Source type or `int` |
All transforms must return `null` for a `null` input value.
+Notes:
+
+1. The result type for `day` has been documented as both `int` and `date` in earlier revisions of this spec. The physical representation has always been a 4-byte integer counting days from `1970-01-01`, regardless of whether the Avro field is annotated with `logicalType: date`. Readers may encounter manifests in either form; per the Avro specification, unrecognized logical type annotations are ignored, so the bytes on disk are identical.
Review Comment:
> per the Avro specification, unrecognized logical type annotations are
> ignored, so the bytes on disk are identical.
Small nit, but I don't think this really captures the problem with the
different types. I think all Iceberg-compatible readers must, by definition,
recognize `logicalType: date`. IIUC the issue is that Iceberg hasn't defined
promotion in either direction between `int` and `date`. If we are
deferring to the Avro specification, [type
resolution](https://avro.apache.org/docs/1.11.1/specification/#schema-resolution)
might be the more applicable section. It doesn't seem to consider logical
types, and I'm not clear whether that is intentional or an oversight. One could
construe it either as talking only about the physical type, or as meaning that
the types are not equal, in which case we are defining here a type promotion
or reconciliation specific to Iceberg.
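To make the distinction concrete, here is a minimal Python sketch. It assumes only what the Avro specification says about binary encoding (an `int` is written as a zigzag varint, and `date` annotates an `int` counting days from the Unix epoch). The helper names `encode_avro_int` and `encode_day` and the two schema dicts are hypothetical, written for illustration, and are not Iceberg or Avro library APIs.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def encode_avro_int(n: int) -> bytes:
    """Encode a 32-bit int as Avro binary does: zigzag, then base-128 varint."""
    if not -2**31 <= n < 2**31:
        raise ValueError("value does not fit in an Avro int")
    z = (n << 1) ^ (n >> 31)  # zigzag maps small negatives to small positives
    out = bytearray()
    while True:
        low7 = z & 0x7F
        z >>= 7
        if z:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_day(value, schema: dict) -> bytes:
    """Hypothetical writer: only the physical type drives the encoding."""
    if schema["type"] != "int":
        raise TypeError("day partition values are physically an int")
    days = (value - EPOCH).days if isinstance(value, date) else value
    return encode_avro_int(days)


# The two forms a manifest field might have been declared with (illustrative).
PLAIN_INT = {"type": "int"}
DATE_INT = {"type": "int", "logicalType": "date"}

# 2024-01-01 is 19723 days after 1970-01-01.
same_bytes = encode_day(19723, PLAIN_INT) == encode_day(date(2024, 1, 1), DATE_INT)
same_declared_type = PLAIN_INT == DATE_INT

print(same_bytes, same_declared_type)  # True False
```

The bytes match, but the declared types do not, so a reader whose expected schema says `date` still needs a rule for accepting a file written as plain `int` (and vice versa). That rule is the part that seems undefined today.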
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]