Re: [PR] `create_table` with a PyArrow Schema [iceberg-python]

via GitHub Mon, 29 Jan 2024 18:03:24 -0800


syun64 commented on code in PR #305:
URL: https://github.com/apache/iceberg-python/pull/305#discussion_r1470494285



##########
pyiceberg/schema.py:
##########
@@ -1221,50 +1221,57 @@ def assign_fresh_schema_ids(schema_or_type: 
Union[Schema, IcebergType], next_id:
 class _SetFreshIDs(PreOrderSchemaVisitor[IcebergType]):
     """Traverses the schema and assigns monotonically increasing ids."""
 
-    reserved_ids: Dict[int, int]
+    old_id_to_new_id: Dict[int, int]

Review Comment:
   I tried this out, and I think we might have to leave this logic in for this 
iteration. 
   
   The usage of create_table in PyIceberg is slightly different from the Java 
implementation in that the user can create an IcebergSchema, instead of a Spark 
DF schema or a PyArrow schema. This means that the user can choose to include 
identifier_field_ids in the IcebergSchema that they define.
   
   I think it might make sense to consolidate the way we track the 
identifier_field_ids on create_table, update_schema, replace_table and other 
similar operations in a future PR. I'll create an issue to track this task



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] `create_table` with a PyArrow Schema [iceberg-python]

Reply via email to