[I] Able to parse name-mapping into a recursive structure. [iceberg-rust]

via GitHub Wed, 27 Nov 2024 04:40:38 -0800


Fokko opened a new issue, #723:
URL: https://github.com/apache/iceberg-rust/issues/723


   Name mapping is used when the Parquet files in a table don't have field IDs 
encoded in them. For example, when files are added through `add_files` during a 
table migration from Hive, the Parquet files carry no field IDs. In that case 
we want to fall back to name mapping: 
https://iceberg.apache.org/spec/#name-mapping-serialization 
The mapping is a JSON blob that is stored with the table in a table property 
(`schema.name-mapping.default`). 
   
   This issue covers only the deserialization of the JSON blob into an 
in-memory structure. The PyIceberg tests can serve as a reference: 
https://github.com/apache/iceberg-python/blob/main/tests/table/test_name_mapping.py
   
   Tip for the future: it is best to store this as a recursive structure so 
that it can be traversed with a `VisitorWithParent`, where a `Schema` and a 
`NameMapping` are traversed at the same time. This matters because the 
name mapping cannot be flattened: field names may themselves contain dots, 
which makes it impossible to tell where a field ends and a subfield begins. 
PyIceberg does this here: 
https://github.com/apache/iceberg-python/pull/1014
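
   To illustrate why the structure should stay recursive, here is a minimal, 
std-only sketch. The type `MappedField` and the helpers `resolve`, `flatten`, 
and `example_mapping` are hypothetical names chosen for this example, not 
existing iceberg-rust APIs. The struct fields mirror the `field-id`, `names`, 
and `fields` keys from the spec, and JSON deserialization itself is left out. 
The sample mapping contains a top-level field literally named `a.b` next to a 
struct `a` with a child `b`; walking the tree keeps them apart, while 
flattening to dotted names makes them collide.

```rust
/// One entry of a name mapping (hypothetical type for this sketch).
/// Mirrors the spec keys `field-id`, `names`, and `fields`.
#[derive(Debug, Clone, PartialEq)]
struct MappedField {
    field_id: Option<i32>,
    names: Vec<String>,
    /// Nested mappings for child fields; this is what makes the type recursive.
    fields: Vec<MappedField>,
}

impl MappedField {
    fn new(field_id: i32, names: &[&str], fields: Vec<MappedField>) -> Self {
        MappedField {
            field_id: Some(field_id),
            names: names.iter().map(|n| n.to_string()).collect(),
            fields,
        }
    }
}

/// Resolve a field ID by descending one path segment per nesting level.
/// Segments are matched verbatim, so a dot inside a name is harmless.
fn resolve(fields: &[MappedField], path: &[&str]) -> Option<i32> {
    let (first, rest) = path.split_first()?;
    let field = fields
        .iter()
        .find(|f| f.names.iter().any(|n| n.as_str() == *first))?;
    if rest.is_empty() {
        field.field_id
    } else {
        resolve(&field.fields, rest)
    }
}

/// Flatten the tree into (dotted name, field ID) pairs. This is lossy:
/// a dot inside a name cannot be told apart from a nesting separator.
fn flatten(fields: &[MappedField], prefix: &str, out: &mut Vec<(String, i32)>) {
    for f in fields {
        for name in &f.names {
            let full = if prefix.is_empty() {
                name.clone()
            } else {
                format!("{}.{}", prefix, name)
            };
            if let Some(id) = f.field_id {
                out.push((full.clone(), id));
            }
            flatten(&f.fields, &full, out);
        }
    }
}

/// Sample mapping: a field named "a.b" plus a struct "a" with a child "b".
fn example_mapping() -> Vec<MappedField> {
    vec![
        MappedField::new(1, &["a.b"], vec![]),
        MappedField::new(2, &["a"], vec![MappedField::new(3, &["b"], vec![])]),
    ]
}

fn main() {
    let mapping = example_mapping();
    // The recursive walk distinguishes the two fields.
    println!("{:?}", resolve(&mapping, &["a.b"]));
    println!("{:?}", resolve(&mapping, &["a", "b"]));
    // The flattened view contains the key "a.b" twice, with different IDs.
    let mut flat = Vec::new();
    flatten(&mapping, "", &mut flat);
    println!("{:?}", flat);
}
```

   A visitor that walks a `Schema` and this tree in lockstep would descend 
through `fields` in the same way `resolve` does, which is only possible when 
the nesting is preserved.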


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

