tarun11Mavani commented on PR #16344: URL: https://github.com/apache/pinot/pull/16344#issuecomment-3154325379
> How will the delays be while committing these segments with this approach? How big is the table (pks, type of pk and size of a segment) and system specs this approach is tested on?

IMO, it shouldn't be much, and much of it will be offset by the reduced time spent writing/uploading the segment, since there is less data now. I tested it on a high-ingestion-rate Kafka topic (1M rows and 256 partitions) with segments committed every 30 minutes; the delay introduced by the additional re-processing was around 1-2s for the roughly 60M rows we processed.

I have a metric, COMMIT_TIME_COMPACTION_BUILD_TIME_MS, which captures the entire build time, not just the overhead of compacting the segment. I will add a similar metric for the regular flow and then share the comparison here.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
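The comparison described above boils down to timing the two build paths with the same clock and looking at the difference. A minimal self-contained sketch of that measurement pattern (class, method, and "simulated work" names here are illustrative, not Pinot's actual metrics API):

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of timing two segment-build paths so their
// wall-clock durations can be compared, as proposed in the comment.
public class BuildTimer {

    // Runs the given build step and returns its wall-clock duration in ms.
    static long timeMs(Runnable buildStep) {
        long startNs = System.nanoTime();
        buildStep.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
    }

    // Stand-in for real segment-build work.
    static void simulateWork(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long regularBuildMs = timeMs(() -> simulateWork(5));
        long compactedBuildMs = timeMs(() -> simulateWork(8));
        // The compaction overhead is the difference between the two paths,
        // measured the same way for both.
        System.out.println("regular build ms:   " + regularBuildMs);
        System.out.println("compacted build ms: " + compactedBuildMs);
        System.out.println("overhead ms:        " + (compactedBuildMs - regularBuildMs));
    }
}
```

Emitting both values as timers (rather than only the compaction-path one) is what makes the planned comparison meaningful, since each includes the full build rather than just the compaction step.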
