laskoviymishka commented on code in PR #16446:
URL: https://github.com/apache/iceberg/pull/16446#discussion_r3289900825


##########
format/spec.md:
##########
@@ -514,12 +514,16 @@ Partition field IDs must be reused if an existing partition spec contains an equ
 | **`truncate[W]`** | Value truncated to width `W` (see below) | `int`, `long`, `decimal`, `string`, `binary` | Source type |
 | **`year`** | Extract a date or timestamp year, as years from 1970 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
 | **`month`** | Extract a date or timestamp month, as months from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
-| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
+| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `date` [1] |
 | **`hour`** | Extract a timestamp hour, as hours from 1970-01-01 00:00:00 | `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
 | **`void`** | Always produces `null` | Any | Source type or `int` |
 
 All transforms must return `null` for a `null` input value.
 
+Notes:
+
+1. The result type for `day` has been documented as both `int` and `date` in earlier revisions of this spec. The physical representation has always been a 4-byte integer counting days from `1970-01-01`, regardless of whether the Avro field is annotated with `logicalType: date`. Readers may encounter manifests in either form; per the Avro specification, unrecognized logical type annotations are ignored, so the bytes on disk are identical.

Review Comment:
   Good catch, agreed. 
   
   Dropped the Avro-spec reference in the latest revision. The note now just states the physical representation and the writer/reader contract directly, without leaning on the "Avro ignores unknown annotations" framing.
   
   Re your deeper point that Iceberg has not defined `int ↔ date` promotion: I think that's a separate spec gap worth its own issue rather than expanding this PR. Happy to open one as a follow-up if you agree it's worth tracking.
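
   For illustration only, here is a minimal Python sketch of the point the note makes: the `day` result is one integer (days from `1970-01-01`), and reading it as `int` or as `date` is only a difference in interpretation. The helper names `day_transform` and `as_date` are hypothetical and not part of any Iceberg API; the sketch also assumes timezone-aware inputs are normalized to UTC and naive inputs are already in UTC.

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_transform(value):
    """Hypothetical helper: days from 1970-01-01 for a date or datetime."""
    if value is None:
        # All transforms must return null for a null input value.
        return None
    if isinstance(value, datetime):
        # datetime is a subclass of date, so it must be checked first.
        if value.tzinfo is not None:
            # Assumption: aware timestamps are normalized to UTC.
            value = value.astimezone(timezone.utc)
        value = value.date()
    return (value - EPOCH).days


def as_date(days):
    """Hypothetical helper: read the same integer back as a calendar date."""
    if days is None:
        return None
    return date.fromordinal(EPOCH.toordinal() + days)


if __name__ == "__main__":
    days = day_transform(datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc))
    print(days, as_date(days))
```

   A reader that treats the partition value as `int` sees the integer, and a reader that treats it as `date` sees the calendar day. Both are derived from the same stored value.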



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

