vrajat commented on code in PR #12157:
URL: https://github.com/apache/pinot/pull/12157#discussion_r1475535154
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -140,6 +142,8 @@ public class PinotLLCRealtimeSegmentManager {
   // Max time to wait for all LLC segments to complete committing their metadata while stopping the controller.
   private static final long MAX_LLC_SEGMENT_METADATA_COMMIT_TIME_MILLIS = 30_000L;
+  private Map<Pair<String, String>, SegmentErrorInfo> _errorCache;

Review Comment:
   One point is that this is for debug APIs only. Errors from this list are not exposed to the user. The official user interface is metrics and logs.

   Based on the fact that the error list is for debugging only:

   1. High error rate: Limiting the size is a good idea so that it does not use up memory. Losing errors is OK, as this is not the source of truth and is only for debugging. Right now there is only one error source. If there are too many errors to keep, the important aspect is that data loss occurred.
   2. Segment lifecycle: Since it is not the source of truth and is used by devs only, a mismatch with the segment lifecycle is OK.

   The main contribution of the PR is the metric to track data loss and the log with all the necessary info.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: commits-h...@pinot.apache.org
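For reference, the size limit mentioned in point 1 of the review comment could be sketched roughly as follows. This is only an illustration and not the code in the PR: the class name `BoundedErrorCache`, the string key, the string error value, and the limit passed to the constructor are all assumptions made to keep the example self-contained (the field in the diff uses `Pair<String, String>` and `SegmentErrorInfo`). It relies on `LinkedHashMap.removeEldestEntry` to drop the oldest error once the limit is exceeded, which matches the idea that losing old errors is acceptable for a debug-only view.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a bounded, debug-only error cache.
// Not Pinot's actual implementation; names and the size limit are assumptions.
class BoundedErrorCache {
  private final int _maxEntries;
  private final Map<String, String> _errors;

  BoundedErrorCache(int maxEntries) {
    _maxEntries = maxEntries;
    // Insertion-ordered map that evicts the oldest entry once the limit is exceeded.
    _errors = new LinkedHashMap<String, String>(16, 0.75f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > _maxEntries;
      }
    };
  }

  // The key is "table/segment" here; the field in the diff uses Pair<String, String>.
  synchronized void record(String table, String segment, String errorMessage) {
    _errors.put(table + "/" + segment, errorMessage);
  }

  // Returns the cached error message, or null if none is cached (or it was evicted).
  synchronized String get(String table, String segment) {
    return _errors.get(table + "/" + segment);
  }

  synchronized int size() {
    return _errors.size();
  }
}
```

With a limit of 2, recording errors for three segments leaves only the two most recent ones in the cache. The metric and the log remain the source of truth, so the evicted entry is not lost for users.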