anentropic opened a new issue, #1337: URL: https://github.com/apache/iceberg-python/issues/1337
### Apache Iceberg version

0.8.0 (latest release)

### Please describe the bug 🐞

It looks like `transform` is intended to be an optional field (?):

```python
class SortField(IcebergBaseModel):
    """Sort order field.

    Args:
      source_id (int): Source column id from the table’s schema.
      transform (str): Transform that is used to produce values to be sorted on from the source column.
                       This is the same transform as described in partition transforms.
      direction (SortDirection): Sort direction, that can only be either asc or desc.
      null_order (NullOrder): Null order that describes the order of null values when sorted. Can only be either nulls-first or nulls-last.
    """

    def __init__(
        self,
        source_id: Optional[int] = None,
        transform: Optional[Union[Transform[Any, Any], Callable[[IcebergType], Transform[Any, Any]]]] = None,
        direction: Optional[SortDirection] = None,
        null_order: Optional[NullOrder] = None,
        **data: Any,
    ):
        if source_id is not None:
            data["source-id"] = source_id
        if transform is not None:
            data["transform"] = transform
        if direction is not None:
            data["direction"] = direction
        if null_order is not None:
            data["null-order"] = null_order
        super().__init__(**data)
```

But if I omit it, as in `SortField(source_id=field.field_id)`, or pass `None`, as in `SortField(source_id=field.field_id, transform=None)`, then I get a pydantic validation error:

```
ValidationError: 1 validation error for SortField
transform
  Field required [type=missing, input_value={'source-id': 4, 'directi...: NullOrder.NULLS_FIRST}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
```

`SortField(source_id=field.field_id, transform=IdentityTransform())` works.

`SortField(source_id=field.field_id, transform=IDENTITY)` also works, but type checkers don't like it.

I think both problems stem from here:

```python
    transform: Annotated[  # type: ignore
        Transform,
        BeforeValidator(parse_transform),
        PlainSerializer(lambda c: str(c), return_type=str),  # pylint: disable=W0108
        WithJsonSchema({"type": "string"}, mode="serialization"),
    ] = Field()
```

The type annotation doesn't make the field `Optional`, and `Field()` gives it no default, so it is required. Separately, `BeforeValidator(parse_transform)` uses `parse_transform` to turn the `IDENTITY` string constant into `IdentityTransform()`, so the type you pass doesn't match the annotation.

For the latter, there is a method described at https://docs.pydantic.dev/2.0/usage/types/custom/#handling-third-party-types that would allow passing a string constant that is converted to an instance of the annotated `Transform` type.