HonahX commented on issue #290: URL: https://github.com/apache/iceberg-python/issues/290#issuecomment-1905147806
> I noticed that Schema class __eq__ function does not check if the schema_ids are equal. I think this is the intended behavior. We consider two schemas equal if they share the same set of fields and identifier fields, as these factors define the table structure. In contrast, the schema-id relates more to when the schema was added to the table. Say if we want to check if two tables have the same structure, we expect `tbl1.schema() == tbl2.schema()` return `True` if both tables have the same columns definition, even if two schemas have different schema-ids. Another example: in `update_schema()` we rely on this equality check to determine if we need to add the new schema, or we can re-use some previously added one: https://github.com/apache/iceberg-python/blob/a56838dc5d9acc5f0e0d70919bfc433c7d0756f1/pyiceberg/table/__init__.py#L1823-L1831 > In test_add_column, it checks if the two schemas are equal using schema_id=0 which is wrong, it should be schema_id=1 in L602 since schema evolution just happened You're right. I think for this test, we can remove the `schema_id=0` and `identifier_field_ids=[]` in the "expected" schema because they will be initialized to these values in default. We can add another assertion to check the schema-id. Does this fix sound reasonable to you? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org