JFinis opened a new issue, #9740:
URL: https://github.com/apache/iceberg/issues/9740

   ### Apache Iceberg version
   
   1.4.3 (latest release)
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   I'm referring to the definition of `field_summary`, which is as follows:
   
   optional | optional | 510 lower_bound | bytes [1] | Lower bound for the non-null, non-NaN values in the partition field, or null if all values are null or NaN [2]
   optional | optional | 511 upper_bound | bytes [1] | Upper bound for the non-null, non-NaN values in the partition field, or null if all values are null or NaN [2]
   
   The fields are `optional`, and the semantics of optional fields are that a 
writer may omit them for any reason. However, for these fields, null has a 
specific meaning: there are no non-null, non-NaN values. With the current 
wording of the spec, a reader cannot rely on this meaning, because a writer 
could write null either because there are no non-null, non-NaN values or 
because it chose not to write the field for some other reason.
   
   Mitigation:
   The spec should mark the `optional` classification of these fields with an 
asterisk and explain in a footnote that a writer may not simply omit them, 
because null has a specific meaning here. The fields are thus something 
between required and optional: they are effectively required, with the null 
value having specific semantics.
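
   On the writer side, the proposed rule could look roughly like the following sketch. `partition_bounds` is a hypothetical helper, not part of any Iceberg library, and it works on plain floats instead of the serialized `bytes` the spec requires, purely to keep the example short.

```python
import math
from typing import Iterable, Optional, Tuple


def partition_bounds(
    values: Iterable[Optional[float]],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (lower, upper) over the non-null, non-NaN values.

    Under the proposed rule, (None, None) is returned only when no such
    value exists, never as a way of skipping the computation.
    """
    usable = [v for v in values if v is not None and not math.isnan(v)]
    if not usable:
        return None, None
    return min(usable), max(usable)
```

   For example, `partition_bounds([1.0, None, float("nan"), 3.5])` returns `(1.0, 3.5)`, while `partition_bounds([None, float("nan")])` returns `(None, None)`.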


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

