corleyma commented on issue #1079: URL: https://github.com/apache/iceberg-python/issues/1079#issuecomment-2299646746
This proposed change made me wonder whether I have been implicitly relying on this behavior (bootstrapping tables with a schema, then ingesting new data that may evolve the schema). E.g., I've been doing something like:

```
with iceberg_table.update_schema(allow_incompatible_changes=allow_incompatible_changes) as update:
    update.union_by_name(sanitized_pyiceberg_schema)
```

When I looked into it, the relevant code path for this turned out to be slightly different, and [lives here](https://github.com/apache/iceberg-python/blob/main/pyiceberg/table/__init__.py#L2881).

In my opinion, the ideal behavior would be to keep the existing field doc if the new schema defines no doc for the field, and otherwise update it. That looks like what the linked code is trying to implement as well, except I think there's a bug in the condition? I.e., shouldn't it be `if field.doc is not None and field.doc != existing_field.doc:`?
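To make the doc-handling rule from the comment concrete, here is a minimal standalone sketch. It does not use pyiceberg: `Field` and `resolve_doc` are hypothetical names invented for illustration, and only the condition itself is taken from the comment. The actual code in `pyiceberg/table/__init__.py` may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; not pyiceberg's NestedField."""

    name: str
    doc: Optional[str] = None


def resolve_doc(existing_field: Field, field: Field) -> Optional[str]:
    """Pick the doc to keep when merging an incoming field into an existing one.

    Uses the condition suggested in the comment: only overwrite the doc when
    the incoming field defines one and it differs from the existing doc.
    """
    if field.doc is not None and field.doc != existing_field.doc:
        return field.doc
    # Incoming field has no doc (or the same doc): keep what the table has.
    return existing_field.doc


# Incoming schema omits the doc, so the existing doc is preserved.
kept = resolve_doc(Field("id", "primary key"), Field("id"))

# Incoming schema defines a different doc, so the doc is updated.
updated = resolve_doc(Field("id", "primary key"), Field("id", "surrogate key"))
```

Under this rule, a schema inferred from newly ingested data (which typically carries no docs) would not wipe docs that were set when the table was bootstrapped.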