emkornfield commented on issue #13855: URL: https://github.com/apache/iceberg/issues/13855#issuecomment-3259294123
> @emkornfield Are we guaranteeing that all optional fields even when it is null would be written in the file going forward based on the spec changes you mentioned?

It was my impression that the general consensus was that all columns from the schema must be written, even optional ones whose values are all null. But even if they aren't, there are two cases:

1. At the time of the write, the column did not exist in the schema and therefore would not have existed in stats (I believe linking the schema solves this use case).
2. At the time of the write, the column did exist in the schema. We can determine from stats whether the column is all null or not; whether it was actually written is immaterial.

The main thing that would need to change in implementations is to optionally record null counts when they are equal to row counts (there might be some edge cases for Array types). My understanding is that most implementations only have static config for stats today.
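
To make the two cases concrete, here is a minimal sketch of how a reader could classify a column for a single data file. It assumes each file carries the field IDs of the schema it was written with, plus per-field null counts. `FileStats`, `column_state`, and all field names are hypothetical and are not part of the Iceberg API; the Array-type edge cases mentioned above are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class FileStats:
    """Hypothetical per-file stats for illustration; not an Iceberg class."""
    row_count: int
    write_schema_field_ids: Set[int]  # field IDs of the schema linked to the file
    null_counts: Dict[int, int]       # field ID -> number of null values


def column_state(stats: FileStats, field_id: int) -> str:
    # Case 1: the column was not in the schema when the file was written,
    # so no stats can exist for it.
    if field_id not in stats.write_schema_field_ids:
        return "not_in_write_schema"

    # Case 2: the column was in the schema at write time. Stats alone decide;
    # whether the column was physically written does not matter.
    nulls = stats.null_counts.get(field_id)
    if nulls is None:
        # The writer did not record a null count for this field.
        return "unknown"
    if nulls == stats.row_count:
        return "all_null"
    return "has_values"


stats = FileStats(
    row_count=100,
    write_schema_field_ids={1, 2, 3},
    null_counts={1: 0, 2: 100},
)
states = {fid: column_state(stats, fid) for fid in (1, 2, 3, 4)}
```

The `"unknown"` branch is the gap described above: it only goes away if writers record the null count even when it equals the row count.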
