kevinjqliu commented on issue #584:
URL: https://github.com/apache/iceberg-python/issues/584#issuecomment-2041559077

   > Further research shows that when I use 
[daft](https://www.getdaft.io/projects/docs/en/latest/user_guide/integrations/iceberg.html#reading-a-table)
 that I'm able to read and use the to_arrow() functionality just fine. This is 
interesting especially because daft utilizes pyiceberg.
   
   The column name transformation behavior is part of the Java Iceberg spec 
when reading/writing parquet files. Specifically, the transformed schema is 
pushed down to the parquet reader/writer. 
   I suspect this happens because the Java parquet implementation supports 
both Avro and parquet schemas (see [parquet 
cli](https://github.com/apache/parquet-mr/blob/db4183109d5b734ec5930d870cdae161e408ddba/parquet-cli/src/main/java/org/apache/parquet/cli/commands/SchemaCommand.java#L106-L111)).
 To stay compatible with both parquet and Avro schemas, column names are 
transformed so that they are also valid Avro names. 
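
   To make the idea concrete, here is a minimal sketch of an Avro-compatible name transformation. Avro names must match `[A-Za-z_][A-Za-z0-9_]*`, so any other character has to be rewritten. The helper name `sanitize_name` and the escaping rule (`_x` followed by the hex code point, or `_` in front of a leading digit) are assumptions for illustration only; the exact rule used by Java Iceberg may differ.

```python
import re

# Avro's rule for valid names.
AVRO_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def sanitize_name(name: str) -> str:
    """Hypothetical helper: rewrite a column name so it is a valid Avro name."""

    def is_letter(ch: str) -> bool:
        return "a" <= ch <= "z" or "A" <= ch <= "Z"

    def is_digit(ch: str) -> bool:
        return "0" <= ch <= "9"

    def escape(ch: str) -> str:
        # Assumed escaping scheme: prefix a digit with "_", and replace
        # any other character with "_x" plus its hex code point.
        if is_digit(ch):
            return "_" + ch
        return "_x" + format(ord(ch), "X")

    out = []
    for i, ch in enumerate(name):
        # Letters and "_" are always allowed; digits are allowed
        # everywhere except in the first position.
        if ch == "_" or is_letter(ch) or (i > 0 and is_digit(ch)):
            out.append(ch)
        else:
            out.append(escape(ch))
    return "".join(out)


for original in ["user id", "a.b", "9lives", "plain_name1"]:
    print(repr(original), "->", repr(sanitize_name(original)))
```

   A reader that is unaware of this transformation will look for the original name in the parquet file and not find it, which is consistent with the mismatch described in this issue.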
   
   From what I've seen, libraries in other languages do not do this. This means 
these libraries can read/write parquet files having special characters in their 
column names.
   
   Daft uses the Rust Arrow library, which can read parquet files with special 
characters in their column names. 
   Similarly, pyarrow can read them as well. 
   
   I checked the major parquet libraries in Python, Rust, and Golang, and they 
all support reading special characters in parquet column names.
   
-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

