Re: [I] Create Iceberg Table from pyarrow Schema with no IDs [iceberg-python]

via GitHub Sun, 21 Jan 2024 00:50:45 -0800


HonahX commented on issue #278:
URL: https://github.com/apache/iceberg-python/issues/278#issuecomment-1902557930

Thanks for summarizing the approaches and explaining the concerns.

> I’m not convinced that we can assign ids without relying on the position
> when generating the name mapping.

I have the same question as @syun64. My understanding is that
`_CreateMappingFromPyArrowSchema` will be very similar to
`assign_fresh_schema_ids`, which incrementally assigns new ids by position as
we visit the given schema.
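
To make "assigns new ids by position" concrete, here is a minimal toy sketch. It
is not pyiceberg code: the dict-based schema and the `assign_ids_by_position`
helper are hypothetical, and the depth-first order is only an assumption chosen
for illustration (the real visitor's traversal order may differ). It shows that
ids come purely from visiting order, so reordering the input changes them.

```python
from itertools import count


def assign_ids_by_position(fields, counter=None):
    """Return a copy of `fields` with an 'id' assigned in visiting order.

    Hypothetical helper: a field is a dict with a 'name' and either a
    primitive 'type' or nested 'fields' (a struct).
    """
    counter = counter if counter is not None else count(1)
    result = []
    for field in fields:
        # The id depends only on when the field is visited, not on its name.
        new_field = {"name": field["name"], "id": next(counter)}
        if "fields" in field:
            new_field["fields"] = assign_ids_by_position(field["fields"], counter)
        else:
            new_field["type"] = field["type"]
        result.append(new_field)
    return result


schema = [
    {"name": "id", "type": "long"},
    {"name": "location", "fields": [
        {"name": "lat", "type": "double"},
        {"name": "lon", "type": "double"},
    ]},
    {"name": "ts", "type": "timestamp"},
]

with_ids = assign_ids_by_position(schema)

# Same columns, different order: every id shifts with the position.
reordered = assign_ids_by_position([schema[2], schema[0], schema[1]])
```

In this toy model `ts` gets a different id in `with_ids` than in `reordered`,
which is the positional dependence being discussed.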

I have another related question. If we
1. create a name-mapping from `pa.Table`
2. use the name-mapping to generate a new iceberg schema
3. use the new iceberg schema to create the table.

what do we do with the name mapping created in step 1 after the table is
created? Do we just discard it, or store it in `schema.name-mapping.default`?
If the latter, I think we need to either update
[`new_table_metadata`](https://github.com/apache/iceberg-python/blob/83306104a25a4ecd1f2185ec46cd9fda247544f4/pyiceberg/table/metadata.py#L399-L409)
so that it does not assign fresh ids when a name mapping is present, or update
`_SetFreshIds` to respect the name mapping if one is given. I would appreciate
any thoughts on this matter!
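
A rough sketch of the concern, using plain Python stand-ins rather than the real
pyiceberg classes. All three functions are hypothetical, the schema is a flat
list of `(id, name)` pairs, and the mapping deliberately starts at 10 so that
it differs from a fresh 1..n assignment; that offset is only a stand-in for any
case where the two assignments could diverge.

```python
from itertools import count


def name_mapping_from_columns(column_names, start=10):
    """Step 1 (hypothetical): build a name -> id mapping by position."""
    return {name: i for i, name in enumerate(column_names, start=start)}


def schema_from_mapping(column_names, mapping):
    """Step 2 (hypothetical): build a schema as (id, name) pairs."""
    return [(mapping[name], name) for name in column_names]


def set_fresh_ids(schema, mapping=None):
    """Step 3 (toy stand-in for id reassignment at table creation).

    Without a mapping, ids are reassigned from 1 by position.
    With a mapping, the ids recorded in the mapping are kept.
    """
    if mapping is not None:
        return [(mapping[name], name) for _, name in schema]
    counter = count(1)
    return [(next(counter), name) for _, name in schema]


columns = ["id", "name", "ts"]
mapping = name_mapping_from_columns(columns)
schema = schema_from_mapping(columns, mapping)

# Ignoring the mapping: the table's ids no longer match a stored mapping.
ignoring = set_fresh_ids(schema)

# Respecting the mapping: the table's ids and the stored mapping agree.
respecting = set_fresh_ids(schema, mapping=mapping)
```

If the mapping is to be kept as a table property, only the second variant keeps
it consistent with the ids in the created table in this toy model.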

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
