singhpk234 opened a new pull request, #11755: URL: https://github.com/apache/iceberg/pull/11755
## About the change

I recently stumbled on a schema where a column was of struct type but the underlying struct was empty. This led to a failure when writing the Parquet file because:

```
This is not a bug. You cannot write a empty struct in parquet. This is due to the way the parquet format works, a parquet file only consists of leaf field data, the intermediate structure is not stored and can be inferred using the schema and the repetition levels and definition levels of the written leaf fields. An empty struct (which is written as a group) has no leaf fields and that is why parquet fails to write this.
```

Taken from: https://issues.apache.org/jira/browse/SPARK-20593

I have not tested this for ORC, but it seems we can ban empty structs when updating the schema. Our spec doesn't say much on this:

```
A struct is a tuple of typed values. Each field in the tuple is named and has an integer id that is unique in the table schema. Each field can be either optional or required, meaning that values can (or cannot) be null. Fields may be any type. Fields may have an optional comment or doc string. Fields can have [default values](https://iceberg.apache.org/spec/?h=spec#default-values).
```

I will update the spec if there is alignment here.
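To illustrate the kind of check being proposed, here is a minimal sketch that walks a schema tree and rejects any struct with no fields. It assumes a simplified, hypothetical schema model: the `Primitive`, `Field`, `Struct`, `ListOf`, and `MapOf` classes and both functions are illustrative only and are not Iceberg's API or the code in this PR.

```python
from dataclasses import dataclass, field
from typing import List, Union


# Hypothetical, simplified schema model (not Iceberg's classes).
@dataclass
class Primitive:
    name: str  # e.g. "long", "string"


@dataclass
class Field:
    name: str
    type: "FieldType"


@dataclass
class Struct:
    fields: List[Field] = field(default_factory=list)


@dataclass
class ListOf:
    element: "FieldType"


@dataclass
class MapOf:
    key: "FieldType"
    value: "FieldType"


FieldType = Union[Primitive, Struct, ListOf, MapOf]


def find_empty_structs(node: FieldType, path: str) -> List[str]:
    """Return the dotted paths of all structs under `node` that have no fields."""
    if isinstance(node, Struct):
        if not node.fields:
            # No leaf columns can exist under this group.
            return [path]
        found: List[str] = []
        for f in node.fields:
            found.extend(find_empty_structs(f.type, f"{path}.{f.name}"))
        return found
    if isinstance(node, ListOf):
        return find_empty_structs(node.element, f"{path}.element")
    if isinstance(node, MapOf):
        return (find_empty_structs(node.key, f"{path}.key")
                + find_empty_structs(node.value, f"{path}.value"))
    return []  # primitives are leaves and are always fine


def validate_no_empty_structs(schema: Struct) -> None:
    """Raise ValueError if any column (at any depth) is an empty struct."""
    empty: List[str] = []
    for f in schema.fields:
        empty.extend(find_empty_structs(f.type, f.name))
    if empty:
        raise ValueError(
            "Empty struct types are not allowed: " + ", ".join(empty))


if __name__ == "__main__":
    schema = Struct([
        Field("id", Primitive("long")),
        Field("payload", Struct()),  # empty struct column
        Field("events", ListOf(Struct([Field("meta", Struct())]))),
    ])
    try:
        validate_no_empty_structs(schema)
    except ValueError as err:
        print(err)
```

Running such a check at schema-update time would surface the problem before any data file is written, rather than at Parquet write time.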