emkornfield commented on issue #13855:
URL: https://github.com/apache/iceberg/issues/13855#issuecomment-3259294123

   > @emkornfield Are we guaranteeing that all optional fields even when it is 
null would be written in the file going forward based on the spec changes you 
mentioned?
   
   My impression was that the general consensus is that every column in the 
schema must be written, including optional columns whose values are all null. 
But even if they aren't, there are two cases:
   1. At the time of the write, the column did not exist in the schema and 
therefore would not have appeared in the stats (I believe linking the schema 
solves this use case).
   2. At the time of the write, the column did exist in the schema. We can 
determine from the stats whether the column is all null or not; whether it was 
actually written is immaterial. The main thing that would need to change in 
implementations is to optionally record null counts when they are equal to row 
counts (there might be some edge cases for Array types). My understanding is 
that most implementations only have static config for stats today.
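
As a rough illustration of the two cases, here is a minimal sketch in Python. It assumes the reader knows the set of field IDs in the schema used at write time (the "linked schema") and has the file's `record_count` and `null_value_counts` stats. The function and enum names are hypothetical and not part of any Iceberg API, and the sketch only considers top-level primitive columns, so the Array edge cases are not handled.

```python
from enum import Enum
from typing import Mapping, Set


class ColumnState(Enum):
    NOT_IN_WRITE_SCHEMA = "not_in_write_schema"  # case 1
    ALL_NULL = "all_null"                        # case 2, null count == row count
    HAS_VALUES = "has_values"                    # case 2, some non-null values
    UNKNOWN = "unknown"                          # case 2, but no null count recorded


def classify_column(
    field_id: int,
    write_schema_field_ids: Set[int],
    record_count: int,
    null_value_counts: Mapping[int, int],
) -> ColumnState:
    """Hypothetical helper: classify one column of one data file from its stats."""
    # Case 1: the column was not in the schema when the file was written,
    # so no stats can exist for it.
    if field_id not in write_schema_field_ids:
        return ColumnState.NOT_IN_WRITE_SCHEMA

    # Case 2: the column was in the write schema. Whether it was physically
    # written does not matter as long as a null count was recorded.
    null_count = null_value_counts.get(field_id)
    if null_count is None:
        # A writer with static stats config may have skipped this column.
        return ColumnState.UNKNOWN
    if null_count == record_count:
        return ColumnState.ALL_NULL
    return ColumnState.HAS_VALUES
```

The `UNKNOWN` branch is the gap described above: if writers always recorded a null count when it equals the row count, that branch would not be reached for all-null columns.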
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


