singhpk234 opened a new pull request, #11755:
URL: https://github.com/apache/iceberg/pull/11755

   ## About the change 
   
   I recently came across a schema in which a column was of struct type but the 
struct had no fields. This led to a failure when writing the Parquet file, 
because:
   
   ```
   This is not a bug. You cannot write a empty struct in parquet.
   
   This is due to the way the parquet format works, a parquet file only 
consists of leaf field data, the intermediate structure is not stored and can 
be inferred using the schema and the repetition levels and definition levels of 
the written leaf fields. An empty struct (which is written as a group) has no 
leaf fields and that is why parquet fails to write this.
   ```
   
   Taken from : https://issues.apache.org/jira/browse/SPARK-20593
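
To make the quoted explanation concrete, here is a minimal sketch of how a nested schema flattens into leaf columns. It does not use the Parquet or Iceberg APIs; `Primitive`, `Field`, `Struct`, and `leaf_paths` are hypothetical names introduced only for illustration. The point is that an empty struct contributes no leaf column, so there is nothing for the writer to store for it.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Primitive:
    name: str  # e.g. "long", "double" (illustrative only)


@dataclass
class Field:
    name: str
    type: "Union[Primitive, Struct]"


@dataclass
class Struct:
    fields: List[Field] = field(default_factory=list)


def leaf_paths(struct, prefix=""):
    """Return the dotted paths of all primitive (leaf) columns under a struct."""
    paths = []
    for f in struct.fields:
        path = f"{prefix}{f.name}"
        if isinstance(f.type, Struct):
            # Intermediate groups are not stored; only their leaves are.
            paths.extend(leaf_paths(f.type, path + "."))
        else:
            paths.append(path)
    return paths


schema = Struct([
    Field("id", Primitive("long")),
    Field("location", Struct([
        Field("lat", Primitive("double")),
        Field("lon", Primitive("double")),
    ])),
    Field("empty", Struct()),  # a struct with no fields
])

# "empty" does not appear in the result: it has no leaf to write.
print(leaf_paths(schema))
```

In this toy model the `empty` column simply disappears from the list of leaf columns, which mirrors why a group with no leaf fields cannot be represented in the file.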
   
   I have not tested this for ORC, but it seems we can ban empty structs when 
updating the schema. Our spec doesn't say much about this:
   
   ```
   A struct is a tuple of typed values. Each field in the tuple is named and 
has an integer id that is unique in the table schema. Each field can be either 
optional or required, meaning that values can (or cannot) be null. Fields may 
be any type. Fields may have an optional comment or doc string. Fields can have 
[default values](https://iceberg.apache.org/spec/?h=spec#default-values).
   ```
   
   I will update the spec if there is alignment on this.
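
As a rough illustration of the kind of check meant by banning empty structs at schema-update time, here is a minimal sketch. It is not the code in this PR and does not use Iceberg's classes: the schema is modeled as a plain nested dict (a dict value stands for a struct, a string for a primitive type), and `find_empty_structs` and `validate_no_empty_structs` are hypothetical helpers.

```python
def find_empty_structs(struct, prefix=""):
    """Return dotted paths of every struct (dict) that has no fields."""
    empty = []
    for name, typ in struct.items():
        path = f"{prefix}{name}"
        if isinstance(typ, dict):
            if not typ:
                empty.append(path)
            else:
                # Recurse so nested empty structs are also reported.
                empty.extend(find_empty_structs(typ, path + "."))
    return empty


def validate_no_empty_structs(schema):
    """Raise ValueError if the schema contains any empty struct."""
    empty = find_empty_structs(schema)
    if empty:
        raise ValueError(
            "Cannot have empty struct columns: " + ", ".join(empty)
        )


# Hypothetical schema with one top-level and one nested empty struct.
proposed = {
    "id": "long",
    "s": {},
    "outer": {"inner": {}, "x": "int"},
}

try:
    validate_no_empty_structs(proposed)
except ValueError as e:
    print(e)
```

Rejecting the schema change up front, rather than at write time, would surface the problem before any data file is attempted.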


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

