syun64 commented on code in PR #305: URL: https://github.com/apache/iceberg-python/pull/305#discussion_r1469643574
########## pyiceberg/schema.py: ########## @@ -1221,50 +1221,57 @@ def assign_fresh_schema_ids(schema_or_type: Union[Schema, IcebergType], next_id: class _SetFreshIDs(PreOrderSchemaVisitor[IcebergType]): """Traverses the schema and assigns monotonically increasing ids.""" - reserved_ids: Dict[int, int] + old_id_to_new_id: Dict[int, int] Review Comment: That makes sense - I'm having an aha moment here, and just wanted to summarize my thoughts... I think we could include the work to enable passing the baseSchema as an input argument to the visitor to [REPLACE TABLE](https://github.com/apache/iceberg-python/issues/281#issuecomment-1905187291) support since that's the function that requires fetching the original IDs from the baseSchema. CREATE TABLE, which is the only operation we've supported so far doesn't need to compare any IDs against a baseSchema. In the java code, [idFor](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/AssignFreshIds.java#L51) function uses the existing ID from the baseSchema by searching it by name if it exists. If it doesn't exist, it assigns the new ID from the counter. Another difference I see with AssignFreshIds in the java code is that it doesn't handle the reassignment of the identifier_field_ids within the visitor. Instead, it returns the schema object and searches the [new identifier field IDs](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/TypeUtil.java#L262) by name. Is there a reason we decided to step away from this approach in the Python implementation? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org