stevenzwu commented on code in PR #16688:
URL: https://github.com/apache/iceberg/pull/16688#discussion_r3376115870


##########
core/src/main/java/org/apache/iceberg/TrackedFile.java:
##########
@@ -49,6 +49,9 @@ interface TrackedFile {
   Types.NestedField FILE_SIZE_IN_BYTES =
       Types.NestedField.required(
           104, "file_size_in_bytes", Types.LongType.get(), "Total file size in 
bytes");
+  Types.NestedField WRITER_FORMAT_VERSION =
+      Types.NestedField.required(

Review Comment:
   Possible third option that satisfies both arguments above. If 
`writer_format_version` is just the table format version of the writer, with 
`0` reserved for pre-v4:
   
   ```
   0 = pre-v4 (sentinel)
   4 = v4
   5 = v5 (future)
   ...
   ```
   
   This keeps the field required + explicit-value (no null-interpretation rule 
on the read side) while avoiding the parallel versioning system. 
`writer_format_version=4` reads as "v4 leaf" without translation; when v5 
lands, `5` is the obvious value.
   
   Migration cost is the same as the current spec proposal: a v4 upgrade cannot 
cheaply distinguish v1/v2/v3 leaves at the root level (the manifest list does 
not carry per-leaf format version; only each leaf's own avro header does), so 
they all collapse to `0`. If precise pre-v4 versions ever matter to a reader, 
the leaf's own header has them.
   
   Would spec PR #16025 be open to changing `1: V4` to `4: V4`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to