Fokko commented on code in PR #6525: URL: https://github.com/apache/iceberg/pull/6525#discussion_r1069165768
##########
python/pyiceberg/avro/reader.py:
##########
@@ -238,41 +248,51 @@ def skip(self, decoder: BinaryDecoder) -> None:
         return self.option.skip(decoder)
 
 
-class StructProtocolReader(Reader):
-    create_struct: Callable[[], StructProtocol]
-    fields: Tuple[Tuple[Optional[int], Reader], ...]
+class StructReader(Reader):
+    field_readers: Tuple[Tuple[Optional[int], Reader], ...]
+    create_struct: Callable[[StructType], StructProtocol]
+    struct: StructType
 
-    def __init__(self, fields: Tuple[Tuple[Optional[int], Reader], ...], create_struct: Callable[[], StructProtocol]):
+    def __init__(
+        self,
+        field_readers: Tuple[Tuple[Optional[int], Reader], ...],
+        create_struct: Callable[[StructType], StructProtocol],
+        struct: StructType,
+    ) -> None:
+        self.field_readers = field_readers
         self.create_struct = create_struct
-        self.fields = fields
+        self.struct = struct
 
-    def create_or_reuse(self, reuse: Optional[StructProtocol]) -> StructProtocol:
-        if reuse:
-            return reuse
-        else:
-            return self.create_struct()
+    def read(self, decoder: BinaryDecoder) -> StructProtocol:
+        struct = self.create_struct(self.struct)
 
-    def read(self, decoder: BinaryDecoder) -> Any:
-        struct = self.create_or_reuse(None)
+        if not issubclass(struct.__class__, StructProtocol):

Review Comment:
This is actually not the best solution. It works when you pass in a constructor, but it doesn't work on a lambda. Ideally, we want to do this check once, when we create the `StructReader`, instead of on every read. I guess we can optimize this once we start doing re-use and create the object only once.
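One reading of the constructor-versus-lambda point: a class can be checked against the protocol up front with `issubclass`, but a lambda is not a class, so nothing can be verified until it has been called and the resulting object inspected. The sketch below illustrates that trade-off without depending on PyIceberg. `StructLike`, `Row`, `validate_factory`, and `validate_instance` are hypothetical stand-ins invented for this example, not names from the PR.

```python
from typing import Any, Callable, Protocol, runtime_checkable


@runtime_checkable
class StructLike(Protocol):
    """Hypothetical stand-in for StructProtocol (methods only)."""

    def __getitem__(self, pos: int) -> Any: ...

    def __setitem__(self, pos: int, value: Any) -> None: ...


class Row:
    """Hypothetical stand-in for Record: a fixed-size positional container."""

    def __init__(self, size: int) -> None:
        self._data = [None] * size

    def __getitem__(self, pos: int) -> Any:
        return self._data[pos]

    def __setitem__(self, pos: int, value: Any) -> None:
        self._data[pos] = value


def validate_factory(factory: Callable[[int], Any]) -> bool:
    """Validate the factory up front if it is a class.

    Returns True when the check could be done at construction time and
    False when it has to be deferred (for example, for a lambda).
    """
    if isinstance(factory, type):
        if not issubclass(factory, StructLike):
            raise ValueError(f"{factory.__name__} does not implement StructLike")
        return True
    return False


def validate_instance(obj: Any) -> None:
    """Deferred check: inspect an object the factory actually produced."""
    if not isinstance(obj, StructLike):
        raise ValueError(f"{type(obj).__name__} does not implement StructLike")


# A class can be verified before any record is read ...
checked_early = validate_factory(Row)
# ... whereas a lambda can only be verified through what it returns.
make_row = lambda size: Row(size)
if not validate_factory(make_row):
    validate_instance(make_row(1))
```

With object re-use, the deferred branch would run once for the single created object rather than on every read, which is the optimization the comment anticipates.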
I've solidified this behavior by adding some tests:

```python
def test_read_struct() -> None:
    mis = MemoryInputStream(b"\x18")
    decoder = BinaryDecoder(mis)

    struct = StructType(NestedField(1, "id", IntegerType(), required=True))
    result = StructReader(((0, IntegerReader()),), Record, struct).read(decoder)
    assert repr(result) == 'Record[id=12]'


def test_read_struct_lambda() -> None:
    mis = MemoryInputStream(b"\x18")
    decoder = BinaryDecoder(mis)

    struct = StructType(NestedField(1, "id", IntegerType(), required=True))
    # You can also pass in an arbitrary function that returns a struct
    result = StructReader(((0, IntegerReader()),), lambda struct: Record(struct), struct).read(decoder)
    assert repr(result) == 'Record[id=12]'
```
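For readers wondering why the single input byte `b"\x18"` yields `id=12` in the tests: Avro encodes `int` and `long` values as a variable-length base-128 integer holding a zigzag-encoded number. The following is a minimal standalone sketch of that decoding, not PyIceberg's actual `BinaryDecoder`. The function name `decode_zigzag_varint` is made up for this example.

```python
def decode_zigzag_varint(data: bytes) -> int:
    """Decode one Avro int/long from the start of `data`.

    Each byte carries 7 payload bits, least-significant group first;
    a set high bit means another byte follows.
    """
    raw = 0
    shift = 0
    for byte in data:
        raw |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:  # high bit clear: this was the last byte
            break
        shift += 7
    else:
        # Loop ended without seeing a terminating byte (or data was empty).
        raise ValueError("truncated varint")
    # Zigzag maps 0, -1, 1, -2, ... onto 0, 1, 2, 3, ...; undo that mapping.
    return (raw >> 1) ^ -(raw & 1)


value = decode_zigzag_varint(b"\x18")  # 0x18 == 24, and 24 >> 1 == 12
```

Odd raw values map to negative numbers, so `b"\x17"` (23) decodes to -12 under the same scheme.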