Re: [PR] InMemory Catalog Implementation [iceberg-python]

via GitHub Sun, 21 Jan 2024 23:38:05 -0800


HonahX commented on code in PR #289:
URL: https://github.com/apache/iceberg-python/pull/289#discussion_r1461380205



##########
pyiceberg/table/__init__.py:
##########
@@ -504,6 +504,12 @@ def _(update: AddSchemaUpdate, base_metadata: TableMetadata, context: _TableMeta
     if update.last_column_id < base_metadata.last_column_id:
         raise ValueError(f"Invalid last column id {update.last_column_id}, must be >= {base_metadata.last_column_id}")
 
+    # `update.schema_.schema_id` should be the last_schema_id + 1
+    last_schema_id = max(schema.schema_id for schema in base_metadata.schemas)
+    next_schema_id = last_schema_id + 1
+    new_schema = update.schema_.model_copy(update={"schema_id": next_schema_id})
+    update = update.model_copy(update={"schema_": new_schema})

Review Comment:
   The `AddSchemaUpdate` should already contain a schema with the changes applied and the schema-id incremented. In pyiceberg, we trust the `update_schema` API to give us the correct one, as I mentioned in this [comment](https://github.com/apache/iceberg-python/issues/290#issuecomment-1903304501).
   
   Since this PR already updates `_commit_table` for the InMemory Catalog, I think we do not need to increment the schema-id again here.
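
To illustrate the point, here is a minimal sketch of the division of responsibility. It does not use the real pyiceberg classes: `Schema`, `Metadata`, `build_updated_schema`, and `apply_add_schema` are hypothetical stand-ins, and the sketch assumes the schema-producing side assigns `max(existing ids) + 1`, as the added lines in the diff do.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for a table schema."""
    schema_id: int
    fields: Tuple[str, ...]


@dataclass(frozen=True)
class Metadata:
    """Hypothetical stand-in for table metadata."""
    schemas: Tuple[Schema, ...]


def build_updated_schema(metadata: Metadata, fields: Tuple[str, ...]) -> Schema:
    """Stand-in for the schema-update API (assumption): applies the
    changes and assigns the next schema id in one place."""
    next_id = max(s.schema_id for s in metadata.schemas) + 1
    return Schema(schema_id=next_id, fields=fields)


def apply_add_schema(metadata: Metadata, schema: Schema) -> Metadata:
    """Stand-in for the update handler: trusts the incoming schema id
    and appends the schema without renumbering it."""
    return replace(metadata, schemas=metadata.schemas + (schema,))


base = Metadata(schemas=(Schema(schema_id=0, fields=("id",)),))
new_schema = build_updated_schema(base, ("id", "name"))
updated = apply_add_schema(base, new_schema)

# Recomputing the id inside the handler would produce the same value
# under these assumptions, which is why the extra step is redundant.
recomputed_id = max(s.schema_id for s in base.schemas) + 1
assert recomputed_id == new_schema.schema_id
```

Under these assumptions the handler only appends the schema it is given, and the id is assigned in exactly one place.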



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

