syun64 commented on code in PR #305:
URL: https://github.com/apache/iceberg-python/pull/305#discussion_r1469643574


##########
pyiceberg/schema.py:
##########
@@ -1221,50 +1221,57 @@ def assign_fresh_schema_ids(schema_or_type: 
Union[Schema, IcebergType], next_id:
 class _SetFreshIDs(PreOrderSchemaVisitor[IcebergType]):
     """Traverses the schema and assigns monotonically increasing ids."""
 
-    reserved_ids: Dict[int, int]
+    old_id_to_new_id: Dict[int, int]

Review Comment:
   That makes sense - I'm having an aha moment here, and just wanted to 
summarize my thoughts...
   
   I think we could include the work to enable passing the baseSchema as an 
input argument to the visitor to [REPLACE 
TABLE](https://github.com/apache/iceberg-python/issues/281#issuecomment-1905187291)
 support since that's the function that requires fetching the original IDs from 
the baseSchema. CREATE TABLE, which is the only operation we've supported so 
far doesn't need to compare any IDs against a baseSchema. In the java code, 
[idFor](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/AssignFreshIds.java#L51)
 function uses the existing ID from the baseSchema by searching it by name if 
it exists. If it doesn't exist, it assigns the new ID from the counter.
   
   Another difference I see with AssignFreshIds in the java code is that it 
doesn't handle the reassignment of the identifier_field_ids within the visitor. 
Instead, it returns the schema object and searches the [new identifier field 
IDs](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/TypeUtil.java#L262)
 by name. Is there a reason we decided to step away from this approach in the 
Python implementation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to