Fokko opened a new issue, #723:
URL: https://github.com/apache/iceberg-rust/issues/723

   Name mapping is used when the Parquet files in a table don't have field IDs 
encoded in them. For example, when files are added through `add_files` during a 
table migration from Hive, the Parquet files carry no field IDs. In that case 
we want to fall back to name mapping: 
https://iceberg.apache.org/spec/#name-mapping-serialization 
The mapping is a JSON blob that's stored alongside the table in a table 
property. 
   
   This issue covers only the deserialization of the JSON blob into an 
in-memory structure. Reference tests from PyIceberg can be found here: 
https://github.com/apache/iceberg-python/blob/main/tests/table/test_name_mapping.py
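
   To make the target shape concrete, here is a minimal Python sketch of the 
deserialization step. It assumes the JSON layout described in the spec link 
above (a list of objects with `names`, an optional `field-id`, and optional 
nested `fields`). The names `MappedField`, `parse_mapped_field`, and 
`parse_name_mapping` are hypothetical and are not the PyIceberg or 
iceberg-rust API.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MappedField:
    """One name-mapping entry; `fields` makes the structure recursive."""
    names: List[str]
    field_id: Optional[int] = None  # "field-id" is optional in the JSON
    fields: List["MappedField"] = field(default_factory=list)


def parse_mapped_field(obj: dict) -> MappedField:
    # Recurse into "fields" so nested structs keep their own sub-mapping.
    return MappedField(
        names=list(obj.get("names", [])),
        field_id=obj.get("field-id"),
        fields=[parse_mapped_field(child) for child in obj.get("fields", [])],
    )


def parse_name_mapping(blob: str) -> List[MappedField]:
    raw = json.loads(blob)
    if not isinstance(raw, list):
        raise ValueError("name mapping must be a JSON list")
    return [parse_mapped_field(item) for item in raw]


# Illustrative input only; the field names and IDs are made up for this sketch.
EXAMPLE = """
[
  {"field-id": 1, "names": ["id", "record_id"]},
  {"field-id": 2, "names": ["data"]},
  {"field-id": 3, "names": ["location"], "fields": [
    {"field-id": 4, "names": ["latitude", "lat"]},
    {"field-id": 5, "names": ["longitude", "long"]}
  ]}
]
"""

mapping = parse_name_mapping(EXAMPLE)
```

   A Rust version would presumably use the same recursive shape, with the 
child list held inside the struct itself.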
   
   Tip for later: it is best to store the mapping as a recursive structure so 
it can be traversed using a `VisitorWithParent`, where a `Schema` and a 
`NameMapping` are traversed together. This matters because the name mapping 
cannot be flattened: field names may themselves contain dots, so a flattened 
dotted path cannot be split reliably into fields and subfields. This is done in 
PyIceberg here: 
https://github.com/apache/iceberg-python/pull/1014
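
   The dotted-name problem can be illustrated with a small Python sketch. It 
is a simplified paired recursion, not PyIceberg's `VisitorWithParent`; the 
types `MappedField` and `SchemaField` and the functions `flatten` and 
`assign_ids` are hypothetical stand-ins. Flattening produces the path `a.b` 
twice (once for a top-level field literally named `a.b`, once for child `b` of 
struct `a`), while walking the schema and the mapping side by side keeps the 
two apart.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MappedField:
    names: List[str]
    field_id: Optional[int] = None
    fields: List["MappedField"] = field(default_factory=list)


@dataclass
class SchemaField:
    # Hypothetical stand-in for a file schema field that has no field ID.
    name: str
    children: List["SchemaField"] = field(default_factory=list)


def flatten(mapping: List[MappedField], prefix: str = "") -> List[Tuple[str, Optional[int]]]:
    """Flatten to dotted paths; shown only to expose the ambiguity."""
    pairs = []
    for m in mapping:
        for name in m.names:
            path = prefix + name
            pairs.append((path, m.field_id))
            pairs.extend(flatten(m.fields, path + "."))
    return pairs


def assign_ids(
    schema_fields: List[SchemaField],
    mapping: List[MappedField],
    parent_path: Tuple[str, ...] = (),
) -> Dict[Tuple[str, ...], Optional[int]]:
    """Walk schema and mapping together, matching names one level at a time."""
    by_name = {name: m for m in mapping for name in m.names}
    result = {}
    for sf in schema_fields:
        m = by_name.get(sf.name)
        if m is None:
            continue  # no mapping entry for this field; leave it unassigned
        path = parent_path + (sf.name,)
        result[path] = m.field_id
        # Descend with the matching sub-mapping, never with a dotted string.
        result.update(assign_ids(sf.children, m.fields, path))
    return result


mapping = [
    MappedField(names=["a.b"], field_id=1),
    MappedField(names=["a"], field_id=2, fields=[MappedField(names=["b"], field_id=3)]),
]
schema = [SchemaField("a.b"), SchemaField("a", [SchemaField("b")])]

flat = flatten(mapping)
ids = assign_ids(schema, mapping)
```

   Here `flat` contains two entries keyed `a.b` with different IDs, whereas 
`ids` keys each field by its tuple path and so stays unambiguous.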


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


