Jackie-Jiang commented on code in PR #15685:
URL: https://github.com/apache/pinot/pull/15685#discussion_r2078521964

##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/json/MutableJsonIndexImpl.java:
##########
@@ -117,9 +131,15 @@ private void addFlattenedRecords(List<Map<String, String>> records) {
       for (Map.Entry<String, String> entry : record.entrySet()) {
         // Put both key and key-value into the posting list. Key is useful for checking if a key exists in the json.
         String key = entry.getKey();
-        _postingListMap.computeIfAbsent(key, k -> new RoaringBitmap()).add(_nextFlattenedDocId);
+        _postingListMap.computeIfAbsent(key, k -> {
+          _bytesSize += Utf8.encodedLength(key);

Review Comment:
   Are we only counting the size of the keys? If we only want to limit the heap usage, we should use the size of the underlying byte array instead of the UTF8 format, which is the size in the final index file.


##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerMeter.java:
##########
@@ -196,7 +196,12 @@ public enum ServerMeter implements AbstractMetrics.Meter {
   PREDOWNLOAD_FAILED("predownloadFailed", true),

   // reingestion metrics
-  SEGMENT_REINGESTION_FAILURE("segments", false);
+  SEGMENT_REINGESTION_FAILURE("segments", false),
+
+  /**
+   * Approximate heap bytes used by the mutable JSON index at the time of index close.
+   */
+  REALTIME_JSON_INDEX_MEMORY_USAGE("bytes", true);

Review Comment:
   (minor) Consider renaming to match the class name
   ```suggestion
     MUTABLE_JSON_INDEX_MEMORY_USAGE("bytes", true);
   ```


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/json/MutableJsonIndexImpl.java:
##########
@@ -722,6 +742,20 @@ public String[] getValuesSV(int[] docIds, int length, Map<String, RoaringBitmap>

   @Override
   public void close() {
+    try {
+      String tableName = SegmentUtils.getTableNameFromSegmentName(_segmentName);
+      _serverMetrics.addMeteredTableValue(tableName, _columnName, ServerMeter.REALTIME_JSON_INDEX_MEMORY_USAGE,

Review Comment:
   Please double check if the metric can be properly emitted using `server.yml` (under `jmx_prometheus_javaagent`) given you are also adding column name
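
To illustrate the first comment above (UTF-8 length versus heap size of the keys), here is a minimal sketch of how the two numbers can differ for the same key. `KeySizeSketch` and both of its methods are hypothetical names used only for illustration and are not part of Pinot. The heap estimate assumes JDK 9+ with compact strings enabled and ignores object and array headers, so treat it as an approximation rather than a measurement.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Pinot: compares two ways of sizing a JSON index key. */
final class KeySizeSketch {
  private KeySizeSketch() {
  }

  /** Bytes the key occupies once serialized as UTF-8 (the on-disk view). */
  static int utf8Bytes(String key) {
    return key.getBytes(StandardCharsets.UTF_8).length;
  }

  /**
   * Rough size of the String's backing byte[] on the heap.
   * Assumption: JDK 9+ compact strings, i.e. 1 byte per char when every char
   * fits in Latin-1, otherwise 2 bytes per char (UTF-16). Headers are ignored.
   */
  static int approxBackingArrayBytes(String key) {
    for (int i = 0; i < key.length(); i++) {
      if (key.charAt(i) > 0xFF) {
        return key.length() * 2;
      }
    }
    return key.length();
  }

  public static void main(String[] args) {
    // ASCII key, Latin-1 key ("caf" + e-acute), and a two-character CJK key.
    String[] keys = {"user.name", "caf\u00e9", "\u540d\u524d"};
    for (String key : keys) {
      System.out.println(key + ": utf8=" + utf8Bytes(key) + ", heap~=" + approxBackingArrayBytes(key));
    }
  }
}
```

For pure ASCII keys the two values agree, but they diverge in both directions otherwise: a Latin-1 key such as "caf\u00e9" is 5 bytes in UTF-8 and about 4 on the heap, while a two-character CJK key is 6 bytes in UTF-8 and about 4 on the heap. Which one to accumulate in `_bytesSize` depends on whether the goal is to bound heap usage or to predict the final index file size.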