Jackie-Jiang commented on code in PR #15685:
URL: https://github.com/apache/pinot/pull/15685#discussion_r2078521964


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/json/MutableJsonIndexImpl.java:
##########
@@ -117,9 +131,15 @@ private void addFlattenedRecords(List<Map<String, String>> records) {
       for (Map.Entry<String, String> entry : record.entrySet()) {
         // Put both key and key-value into the posting list. Key is useful for checking if a key exists in the json.
         String key = entry.getKey();
-        _postingListMap.computeIfAbsent(key, k -> new RoaringBitmap()).add(_nextFlattenedDocId);
+        _postingListMap.computeIfAbsent(key, k -> {
+          _bytesSize += Utf8.encodedLength(key);

Review Comment:
   Are we only counting the size of the keys?
   
   If the goal is only to limit heap usage, we should use the size of the 
underlying byte array instead of the UTF-8 encoded length, which is the size 
in the final index file.
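To illustrate the distinction, here is a minimal sketch comparing the UTF-8 encoded length of a key with a rough estimate of its backing array size on the heap. It assumes JDK 9+ with compact strings enabled (Latin-1 strings use one byte per char, all others two), ignores object and array headers, and uses a hypothetical class `KeySizeSketch` that is not part of the PR.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Pinot or this PR.
class KeySizeSketch {
  // Bytes the key occupies once encoded as UTF-8 (the serialized form).
  public static int utf8Length(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  // Rough length of the String's backing byte[] on the heap.
  // Assumption: JDK 9+ compact strings, so a string whose chars all fit
  // in Latin-1 (<= 0xFF) takes 1 byte per char, otherwise 2 bytes per char.
  // Object and array headers are not included.
  public static int estimatedBackingArrayLength(String s) {
    for (int i = 0; i < s.length(); i++) {
      if (s.charAt(i) > 0xFF) {
        return 2 * s.length();
      }
    }
    return s.length();
  }

  public static void main(String[] args) {
    // Unicode escapes keep the source file ASCII-only.
    String[] keys = {"$.name", "$.caf\u00e9", "$.\u540d"};
    for (String key : keys) {
      System.out.println("chars=" + key.length()
          + " utf8=" + utf8Length(key)
          + " heapArray=" + estimatedBackingArrayLength(key));
    }
  }
}
```

For ASCII-only keys the two numbers match, but they diverge as soon as a key contains non-ASCII characters, so the UTF-8 length can either over- or under-estimate the heap footprint.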



##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerMeter.java:
##########
@@ -196,7 +196,12 @@ public enum ServerMeter implements AbstractMetrics.Meter {
   PREDOWNLOAD_FAILED("predownloadFailed", true),
 
   // reingestion metrics
-  SEGMENT_REINGESTION_FAILURE("segments", false);
+  SEGMENT_REINGESTION_FAILURE("segments", false),
+
+  /**
+   * Approximate heap bytes used by the mutable JSON index at the time of index close.
+   */
+  REALTIME_JSON_INDEX_MEMORY_USAGE("bytes", true);

Review Comment:
   (minor) Consider renaming to match the class name
   ```suggestion
     MUTABLE_JSON_INDEX_MEMORY_USAGE("bytes", true);
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/json/MutableJsonIndexImpl.java:
##########
@@ -722,6 +742,20 @@ public String[] getValuesSV(int[] docIds, int length, Map<String, RoaringBitmap>
 
   @Override
   public void close() {
+    try {
+      String tableName = SegmentUtils.getTableNameFromSegmentName(_segmentName);
+      _serverMetrics.addMeteredTableValue(tableName, _columnName, ServerMeter.REALTIME_JSON_INDEX_MEMORY_USAGE,

Review Comment:
   Please double check whether the metric can be properly emitted using `server.yml` 
(under `jmx_prometheus_javaagent`), given that you are also adding the column name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

