fengjiajie commented on PR #8808:
URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1771924888

   > how are we guaranteed that the binary is parsable as UTF8 bytes?
   
   @RussellSpitzer Thank you for participating in the review.
   If a column is not encoded in UTF-8, it should not be defined as a string
   type in the Iceberg metadata. A column declared as a string is therefore
   expected to contain valid UTF-8 bytes.
   
   The type used when reading data should be determined by the column type
   defined in the Iceberg metadata, not by the column type recorded in the
   Parquet file. An imperfect analogy is reading a CSV file: the column types
   are determined by the table's schema metadata at read time, not by anything
   defined in the CSV file itself.
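
As a rough illustration of that rule, here is a minimal Python sketch in which the table-level type, not the file's physical type, decides how raw bytes are interpreted. `TableType` and `read_value` are hypothetical names invented for this example and are not part of the Iceberg or Parquet APIs. The strict UTF-8 decode is an assumption about how one might surface bad data, not a description of what the PR does.

```python
from enum import Enum


class TableType(Enum):
    """Hypothetical stand-in for a column type declared in table metadata."""
    STRING = "string"
    BINARY = "binary"


def read_value(raw: bytes, table_type: TableType):
    """Interpret raw file bytes according to the table-level column type.

    The physical type stored in the data file is deliberately not consulted.
    """
    if table_type is TableType.STRING:
        # Strict decoding: bytes that are not valid UTF-8 raise
        # UnicodeDecodeError instead of being silently replaced.
        return raw.decode("utf-8")
    # Any other declared type in this sketch is returned as raw bytes.
    return raw
```

With this sketch, the same bytes read as text when the table declares `STRING` and stay as bytes when it declares `BINARY`. A column wrongly declared as a string fails loudly on the first invalid byte sequence.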


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

