gviedma commented on issue #10560: URL: https://github.com/apache/pinot/issues/10560#issuecomment-1500397690
I think it would be useful to measure the thread CPU time and memory required for consumption on a per-segment/partition basis (similar to what we do for query execution statistics on the broker in `ExecutionStatsAggregator` by leveraging `ThreadResourceUsageProvider`). It might also be beneficial to break these metrics down further by "phase". For example, we could publish fine-grained CPU time/memory metrics for each phase of ingestion:

1. the consumer fetches a batch of messages (call to `partitionGroupConsumer.fetchMessages`),
2. the messages are decoded (call to `streamDataDecoder.decode`),
3. the rows are processed by the transform pipeline (call to `transformPipeline.processRow`), and
4. the rows are indexed (call to `realtimeSegment.index`).

In addition, it would be useful to have an end-to-end metric that combines all of the above.
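
To make the per-phase idea concrete, here is a minimal sketch of how the CPU time of each ingestion phase could be accumulated on the consuming thread. It is only an illustration under stated assumptions: `IngestionPhaseTimer`, its `Phase` enum, and its methods are hypothetical names and not existing Pinot classes, and it reads CPU time from the JDK's `java.lang.management.ThreadMXBean` directly rather than through `ThreadResourceUsageProvider`. Memory tracking is left out; it would presumably follow the same start/stop pattern around each phase.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Pinot): accumulates thread CPU time per ingestion phase.
 * Not thread-safe; the assumption is one instance per consuming thread/partition.
 */
final class IngestionPhaseTimer {
  /** The four phases named in the comment: fetch, decode, transform, index. */
  enum Phase { FETCH, DECODE, TRANSFORM, INDEX }

  private static final ThreadMXBean THREAD_MX_BEAN = ManagementFactory.getThreadMXBean();

  private final Map<Phase, Long> cpuNanosByPhase = new EnumMap<>(Phase.class);

  /** Runs the work on the calling thread and adds the CPU time it consumed to the given phase. */
  void measure(Phase phase, Runnable work) {
    // CPU time measurement may be unsupported or disabled on some JVMs; record 0 in that case.
    boolean supported =
        THREAD_MX_BEAN.isCurrentThreadCpuTimeSupported() && THREAD_MX_BEAN.isThreadCpuTimeEnabled();
    long startNanos = supported ? THREAD_MX_BEAN.getCurrentThreadCpuTime() : 0L;
    try {
      work.run();
    } finally {
      long elapsedNanos = supported ? THREAD_MX_BEAN.getCurrentThreadCpuTime() - startNanos : 0L;
      cpuNanosByPhase.merge(phase, Math.max(0L, elapsedNanos), Long::sum);
    }
  }

  /** CPU nanoseconds accumulated so far for one phase (0 if the phase was never measured). */
  long cpuNanos(Phase phase) {
    return cpuNanosByPhase.getOrDefault(phase, 0L);
  }

  /** End-to-end metric: the sum of all per-phase CPU times. */
  long endToEndCpuNanos() {
    long total = 0L;
    for (long nanos : cpuNanosByPhase.values()) {
      total += nanos;
    }
    return total;
  }
}
```

In a consume loop, each of the four calls would be wrapped in its own `measure(...)` invocation, for example `timer.measure(Phase.DECODE, () -> ...)` around the call to `streamDataDecoder.decode`, and the accumulated values would be published as metrics when the segment completes. Deriving the end-to-end value from the per-phase sums keeps the two consistent, at the cost of excluding any work done between phases.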