omarsmak edited a comment on issue #324:
URL: 
https://github.com/apache/camel-kafka-connector/issues/324#issuecomment-661698329


   @nidhijwt the issue here, as I understand it, is the number of records being sent: every Kafka record becomes an equivalent Azure append operation, which can hit the 50,000-block limit pretty fast, especially if you have noise records that were never meant to be inserted (you can filter those out with SMTs). Either way, as you mentioned, you will need some way to aggregate these records and insert them in as few batches as possible. Off the top of my head, you have these options:
   
   1. Write an aggregator using, for example, [Kafka Streams](https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html#aggregating) that aggregates the messages on a rolling basis into an output Kafka topic containing the aggregated records. You can then sink that output topic using `CamelAzurestorageblobSinkConnector`.
   
   1. Use Camel's `AggregationStrategy`, which @oscerd mentioned he will document in the next release. I suspect this is the more practical option, since you don't need to maintain an extra application such as a Kafka Streams job just to handle the aggregation.
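
   To make the effect of option 1 concrete, here is a minimal plain-Java sketch of the batching idea (not the actual Kafka Streams DSL, which would need the `kafka-streams` dependency and a running broker; `RecordBatcher` and its method are hypothetical names for illustration). The point is that N incoming records collapse into N/batchSize append operations:

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical helper illustrating the batching idea behind option 1:
   // instead of one Azure append per Kafka record, records are grouped
   // into batches so far fewer append operations are issued.
   public class RecordBatcher {
       // Groups records into batches of at most batchSize elements each.
       public static List<List<String>> batch(List<String> records, int batchSize) {
           List<List<String>> batches = new ArrayList<>();
           for (int i = 0; i < records.size(); i += batchSize) {
               int end = Math.min(i + batchSize, records.size());
               batches.add(new ArrayList<>(records.subList(i, end)));
           }
           return batches;
       }

       public static void main(String[] args) {
           List<String> records = List.of("r1", "r2", "r3", "r4", "r5");
           // Five records collapse into two append operations instead of five.
           System.out.println(batch(records, 3)); // [[r1, r2, r3], [r4, r5]]
       }
   }
   ```

   In a real Kafka Streams topology you would do the equivalent with a windowed `aggregate()` writing to the output topic, which the sink connector then consumes.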
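
   For option 2, the connector-side aggregation is configured through connector properties rather than code. The sketch below is an assumption about what that configuration looks like (the exact property names and the `StringAggregator` class should be checked against the camel-kafka-connector documentation for your release, since the docs for this were still being written at the time of this comment):

   ```properties
   # Hypothetical sketch of connector-level aggregation; verify property
   # names against the camel-kafka-connector docs for your version.
   connector.class=org.apache.camel.kafkaconnector.azurestorageblob.CamelAzurestorageblobSinkConnector
   topics=mytopic

   # Aggregate records before the sink endpoint is invoked:
   camel.beans.aggregate=#class:org.apache.camel.kafkaconnector.aggregator.StringAggregator
   camel.aggregation.size=1000
   camel.aggregation.timeout=5000
   ```

   With something like this, the connector flushes one aggregated exchange per 1000 records (or per timeout), instead of one append per record.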

