Re: [I] Create Iceberg Table from pyarrow Schema with no IDs [iceberg-python]

via GitHub Fri, 19 Jan 2024 06:58:28 -0800


syun64 commented on issue #278:
URL: https://github.com/apache/iceberg-python/issues/278#issuecomment-1900574539


   That sounds good @Fokko 
   
   I think having a _CreateMappingFromPyArrowSchema pre-order visitor does a 
good job of separating out the two concerns above. 
   
   > I think the outcome will be the same as the pre-order visitor, but we 
don’t do it by position, but by name.
   
   I think this bit about not doing it by position is catching me a bit off 
guard because I’m not convinced that we can assign ids without relying on the 
position when generating the name mapping. Just to make sure we are on the same 
page, this new Visitor will:
   1. Map field_ids from the PyArrow Schema if the field_ids exist
   2. Have a Boolean flag to _assign_fresh_ids that ignores existing field_ids 
(or falls back automatically to assigning ids if field_ids don’t exist) and 
assigns field ids **by position**
   
   And then, we will use the name mapping generated from the pyarrow schema to 
assign field ids **by name** and create a new Iceberg Schema.
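   
   To make the two steps concrete, here is a rough sketch of how I picture it. 
It uses a hypothetical `Field` dataclass as a stand-in for a pyarrow field, and 
the helper names `create_name_mapping` and `apply_name_mapping` are made up for 
illustration; none of this is existing pyiceberg API. It also assumes fresh ids 
are handed out in plain pre-order starting at 1, which may not match how 
pyiceberg would actually number them.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Field:
    # Hypothetical stand-in for a pyarrow field (not a pyarrow/pyiceberg class).
    name: str
    field_id: Optional[int] = None
    children: List["Field"] = field(default_factory=list)


def _all_have_ids(fields: List[Field]) -> bool:
    # True only if every field, at every nesting level, carries a field_id.
    return all(f.field_id is not None and _all_have_ids(f.children) for f in fields)


def create_name_mapping(fields: List[Field], assign_fresh_ids: bool = False) -> Dict[str, int]:
    """Step 1: pre-order walk producing {dotted field path: field id}.

    Existing field_ids are reused unless assign_fresh_ids is set or some
    field has no id, in which case ids are assigned by position.
    """
    fresh = assign_fresh_ids or not _all_have_ids(fields)
    next_id = count(1)
    mapping: Dict[str, int] = {}

    def visit(f: Field, prefix: str) -> None:
        path = f"{prefix}.{f.name}" if prefix else f.name
        mapping[path] = next(next_id) if fresh else f.field_id
        for child in f.children:
            visit(child, path)

    for f in fields:
        visit(f, "")
    return mapping


def apply_name_mapping(fields: List[Field], mapping: Dict[str, int], prefix: str = "") -> List[Field]:
    """Step 2: build a new schema whose ids are looked up by name, not position."""
    result = []
    for f in fields:
        path = f"{prefix}.{f.name}" if prefix else f.name
        result.append(Field(f.name, mapping[path], apply_name_mapping(f.children, mapping, path)))
    return result


# A schema with no ids: the mapping falls back to positional assignment.
schema = [
    Field("id"),
    Field("location", children=[Field("lat"), Field("long")]),
    Field("name"),
]
mapping = create_name_mapping(schema)
schema_with_ids = apply_name_mapping(schema, mapping)
```

   The point of the sketch is that position only matters while the mapping is 
being generated; once the mapping exists, the ids are applied purely by name.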
   
   Does that approach sound consistent with your current thought?



