navina commented on PR #9544: URL: https://github.com/apache/pinot/pull/9544#issuecomment-1272803951
@vvivekiyer Not sure how much value a discussion can offer now that the PR has already been merged, but here is my take:

> MessageBatch is a generic interface because users of Pinot are free to use their custom kafka (or other) client implementations that could return messages in any format.

I understand the flexibility that the generic MessageBatch provides. But we want a stronger interface contract so that developing a plugin becomes trivial and streamlined. Features we can add:
* Filtering records based on message metadata (benefit: avoids the payload deserialization cost)
* Computing the time taken to deserialize a payload
* Streamlining the handling of decode failures

> As you mentioned, the new code assumes that when messages are read from the stream consumer, they will always be in serialized format. But the existing code for MessageBatch<T> and StreamMessageDecoder<T> doesn't honor the assumption.

Yes, that was the whole point of changing to the new code. I was trying to decouple the "decoding" of a message from the "fetching" of a message from the stream.

<img width="546" alt="Screen Shot 2022-10-10 at 10 51 20 AM" src="https://user-images.githubusercontent.com/1909480/194803076-aa074171-b322-4d5a-86a7-b1a4d326e651.png">

> Just to give more clarity about linkedin's custom implementation of interfaces:
> LiKafkaMessageBatch implements MessageBatch<IndexedRecord>
> LiKafkaConsumer extends PartitionLevelConsumer
> LiKafkaDecoder implements StreamMessageDecoder<IndexedRecord>

IIRC, LinkedIn has multiple client libraries, and I am fairly certain that almost all of them, except `linked-kafka-clients`, allow you to customize the serializer/deserializer in their clients. Moreover, LinkedIn's Avro decoder is also a separate library that can be used in your custom implementation as `LiKafkaDecoder implements StreamMessageDecoder<byte[]>`.

Can you help me understand why this approach cannot be used by LinkedIn's Pinot?
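
To make the decoupling argument concrete, here is a minimal sketch of what a bytes-only contract could enable. It is not Pinot's actual API: `RawMessage`, `Decoder`, `Result`, `process`, and `demo` are hypothetical names chosen for illustration, and the sketch assumes the consumer always hands over undecoded bytes plus header metadata.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are Pinot's real interfaces.
final class DecoupledDecodeSketch {

  /** What the consumer hands over: undecoded bytes plus message metadata. */
  static final class RawMessage {
    final byte[] payload;
    final Map<String, String> headers;

    RawMessage(String payload, Map<String, String> headers) {
      this.payload = payload.getBytes(StandardCharsets.UTF_8);
      this.headers = headers;
    }
  }

  /** Decoding is a separate step that only ever sees bytes. */
  interface Decoder<R> {
    R decode(byte[] payload) throws Exception;
  }

  /** Outcome of processing one fetched batch. */
  static final class Result<R> {
    final List<R> rows = new ArrayList<>();
    int filtered;
    int decodeFailures;
    long decodeNanos;
  }

  static <R> Result<R> process(List<RawMessage> batch,
      Predicate<Map<String, String>> metadataFilter, Decoder<R> decoder) {
    Result<R> result = new Result<>();
    for (RawMessage message : batch) {
      // Filter on metadata first, so skipped records are never deserialized.
      if (!metadataFilter.test(message.headers)) {
        result.filtered++;
        continue;
      }
      long start = System.nanoTime();
      try {
        result.rows.add(decoder.decode(message.payload));
      } catch (Exception e) {
        // Decode failures are handled in one place, outside any plugin.
        result.decodeFailures++;
      } finally {
        // Time spent decoding is measured uniformly for every decoder.
        result.decodeNanos += System.nanoTime() - start;
      }
    }
    return result;
  }

  /** Three messages: one decodes, one is filtered by header, one fails to decode. */
  static Result<Integer> demo() {
    List<RawMessage> batch = List.of(
        new RawMessage("42", Map.of("region", "us")),
        new RawMessage("7", Map.of("region", "eu")),
        new RawMessage("oops", Map.of("region", "us")));
    return process(batch,
        headers -> "us".equals(headers.get("region")),
        payload -> Integer.parseInt(new String(payload, StandardCharsets.UTF_8)));
  }

  public static void main(String[] args) {
    Result<Integer> r = demo();
    System.out.println(r.rows + " filtered=" + r.filtered + " failures=" + r.decodeFailures);
  }
}
```

Because `process` owns the loop, the metadata filter, the timing, and the failure accounting all live outside the plugin, which only has to supply a `Decoder`. A generic batch type that may already contain decoded objects would not allow any of the three to be done uniformly.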