pvary commented on PR #10944:
URL: https://github.com/apache/iceberg/pull/10944#issuecomment-2290851832

   I forgot about the `AvroReader` when we had our offline discussion. This PR 
has become bigger than I anticipated.
   It is good in itself, but I have had my doubts about the `AvroReader` ever 
since I started working on the watermark-related features.
   
   I think we need a more generic and easier way to manipulate the output of 
the source so that it emits the desired type of objects 
(RowData/Avro/UserDefined). I am thinking along the lines of the Kafka 
`org.apache.kafka.common.serialization.Deserializer` interface. The user could 
provide the deserializer, and we could use it in the `SerializableRecordEmitter` 
to convert the raw Iceberg record to the desired value. We could provide the 
`RawDataDeserializer` and the `AvroDeserializer` to achieve the current 
functionality, and probably deprecate the whole `ReaderFunction` altogether.
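
   To make the idea concrete, here is a minimal sketch of what I have in mind. 
Every name in it (`RecordDeserializer`, `Emitter`, `DeserializerSketch`) is 
hypothetical and is not existing Iceberg or Flink API. A plain 
`Map<String, Object>` stands in for the raw Iceberg record, and `Emitter` only 
mimics the role the `SerializableRecordEmitter` would play.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.stream.Collectors;

class DeserializerSketch {

  /** Hypothetical user-supplied converter, loosely modeled on Kafka's Deserializer. */
  interface RecordDeserializer<T> extends Serializable {
    T deserialize(Map<String, Object> rawRecord);
  }

  /** Hypothetical stand-in for the emitter: converts each raw record before emitting it. */
  static final class Emitter<T> {
    private final RecordDeserializer<T> deserializer;

    Emitter(RecordDeserializer<T> deserializer) {
      this.deserializer = deserializer;
    }

    void emit(Map<String, Object> rawRecord, Consumer<T> output) {
      output.accept(deserializer.deserialize(rawRecord));
    }
  }

  /** Example user-defined conversion: joins the field values into one CSV line. */
  static String toCsv(Map<String, Object> rawRecord) {
    return rawRecord.values().stream()
        .map(String::valueOf)
        .collect(Collectors.joining(","));
  }

  /** Runs one sample record through an emitter configured with the CSV deserializer. */
  static List<String> emitSample() {
    Map<String, Object> raw = new LinkedHashMap<>();
    raw.put("id", 1L);
    raw.put("name", "alice");

    List<String> collected = new ArrayList<>();
    Emitter<String> emitter = new Emitter<>(DeserializerSketch::toCsv);
    emitter.emit(raw, collected::add);
    return collected;
  }

  public static void main(String[] args) {
    System.out.println(emitSample()); // prints [1,alice]
  }
}
```

   With this shape the output type is decided entirely by the deserializer that 
the user passes in, so the built-in variants would just be two more 
implementations of the same interface.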
   
   I have wanted to try this concept for a long time now, but lack of time has 
prevented me from doing so.
   WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

