jadami10 commented on PR #13298: URL: https://github.com/apache/pinot/pull/13298#issuecomment-2361386066
> the earlier metric was really noisy since it relies on time column value instead of ingestion time which led to false positives.

That sounds like a misuse of the `StreamMessageMetadata`. There are 2 fields in https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/stream/StreamMessageMetadata.java#L71, `getRecordIngestionTimeMs` and `getFirstStreamRecordIngestionTimeMs`, with 2 corresponding metrics to distinguish between source time and publish time.

> would making this metric configurable help? that way you'd be able to disable it without code changes

Only if we don't have a way to cap frequency. And it should be off by default.

> let me also see if there's a way to reduce frequency

I think a key part here is that we need to cap the frequency. For large Pinot deployments, you may have thousands of tables and hundreds of thousands of partitions consumed. So the baseline is O(100k) calls. But adding a new table consuming N partitions shouldn't add N more calls. We effectively need a global throttle, though I don't think there's a way to prevent starvation at large enough scale.
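
To make the "global throttle" idea concrete, here is a minimal sketch, assuming a single process-wide limiter that every partition consumer consults before making the call. `GlobalCallThrottle` and `tryAcquire` are hypothetical names for illustration, not existing Pinot classes or APIs, and the rate is an arbitrary example value.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical process-wide throttle (illustrative only, not part of Pinot). */
final class GlobalCallThrottle {
  private final long minIntervalNanos;
  private final LongSupplier nanoClock;
  // Earliest time (in clock nanos) at which the next call may be admitted.
  private final AtomicLong nextAllowedNanos;

  GlobalCallThrottle(double maxCallsPerSecond, LongSupplier nanoClock) {
    this.minIntervalNanos = (long) (1_000_000_000L / maxCallsPerSecond);
    this.nanoClock = nanoClock;
    this.nextAllowedNanos = new AtomicLong(nanoClock.getAsLong());
  }

  /**
   * Returns true if the caller may make the call now. Never blocks.
   * The cap is shared by all callers, so adding partitions does not add calls.
   */
  boolean tryAcquire() {
    long now = nanoClock.getAsLong();
    long next = nextAllowedNanos.get();
    // Subtraction-based comparison stays correct if the nano clock wraps around.
    // The CAS ensures only one of several racing callers wins a given slot.
    return now - next >= 0 && nextAllowedNanos.compareAndSet(next, now + minIntervalNanos);
  }
}
```

In production the clock would be `System::nanoTime`, and each partition consumer would skip the call whenever `tryAcquire()` returns false. Nothing in this sketch makes admission fair: the same partitions can keep winning the race, which is the starvation concern at large scale.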