pvary commented on PR #10944: URL: https://github.com/apache/iceberg/pull/10944#issuecomment-2290851832
I forgot about the `AvroReader` when we had our offline discussion. This PR became bigger than I had anticipated. It is good in itself, but I have my doubts about the `AvroReader`, since I have been working on the watermark-related features.

I think we need a more generic and easier way to manipulate the output of the source so that it emits the desired type of objects (RowData/Avro/UserDefined). I am thinking along the lines of the Kafka `org.apache.kafka.common.serialization.Deserializer` interface. The user could provide the deserializer, and we could use it in the `SerializableRecordEmitter` to convert the raw Iceberg record to the desired value. We could provide the `RawDataDeserializer` and the `AvroDeserializer` to achieve the current functionality, and probably deprecate the whole `ReaderFunction` altogether.

I have wanted to try this concept for a long time, but lack of time has prevented me from doing so. WDYT?
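
To make the deserializer idea concrete, here is a minimal, dependency-free sketch. Everything in it is hypothetical and only illustrates the shape of the proposal: `RecordDeserializer`, `RawRecord`, `IdentityDeserializer`, `User`, `UserDeserializer` and `DeserializingEmitter` are made-up names, not existing Iceberg, Flink or Kafka classes, and the real `SerializableRecordEmitter` has a different signature.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical interface, loosely modeled on the idea behind Kafka's Deserializer.
// It is Serializable so that it could be shipped to the readers together with the source.
interface RecordDeserializer<IN, OUT> extends Serializable {
    OUT deserialize(IN rawRecord);
}

// Hypothetical stand-in for a raw Iceberg record: field name -> value.
final class RawRecord {
    private final Map<String, Object> fields;

    RawRecord(Map<String, Object> fields) {
        this.fields = Map.copyOf(fields);
    }

    Object get(String name) {
        return fields.get(name);
    }
}

// Built-in style deserializer: passes the raw record through unchanged.
final class IdentityDeserializer implements RecordDeserializer<RawRecord, RawRecord> {
    @Override
    public RawRecord deserialize(RawRecord rawRecord) {
        return rawRecord;
    }
}

// Example of a user-defined output type.
final class User {
    final long id;
    final String name;

    User(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// User-provided deserializer: converts the raw record into the user's own type.
final class UserDeserializer implements RecordDeserializer<RawRecord, User> {
    @Override
    public User deserialize(RawRecord rawRecord) {
        long id = ((Number) rawRecord.get("id")).longValue();
        String name = (String) rawRecord.get("name");
        return new User(id, name);
    }
}

// Simplified stand-in for the emitter: applies the deserializer to every raw record
// before handing the converted value to the output.
final class DeserializingEmitter<OUT> {
    private final RecordDeserializer<RawRecord, OUT> deserializer;

    DeserializingEmitter(RecordDeserializer<RawRecord, OUT> deserializer) {
        this.deserializer = deserializer;
    }

    void emit(RawRecord rawRecord, Consumer<OUT> output) {
        output.accept(deserializer.deserialize(rawRecord));
    }
}
```

With this shape, switching the emitted type would only mean passing a different deserializer to the emitter, for example `new DeserializingEmitter<>(new UserDeserializer())`, without touching the reading logic.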