Jackie-Jiang commented on code in PR #14479: URL: https://github.com/apache/pinot/pull/14479#discussion_r1868317711
########## pinot-core/src/test/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManagerTest.java: ########## @@ -637,6 +646,26 @@ public void testEndCriteriaChecking() segmentDataManager._timeSupplier.set(endTime); Assert.assertTrue(segmentDataManager.invokeEndCriteriaReached()); } + + // test end criteria reached if any of the index cannot take more rows + try (FakeRealtimeSegmentDataManager segmentDataManager = createFakeSegmentManager(false, new TimeSupplier(), null, + null, null, true)) { + segmentDataManager._state.set(segmentDataManager, RealtimeSegmentDataManager.State.INITIAL_CONSUMING); + Assert.assertFalse(segmentDataManager.invokeEndCriteriaReached()); + + Field realtimeSegmentField = RealtimeSegmentDataManager.class.getDeclaredField("_realtimeSegment"); Review Comment: Using reflection is not reliable. Consider adding a package private method `canAddMore()` into `RealtimeSegmentDataManager` and we can override it here ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java: ########## @@ -1265,6 +1281,10 @@ private boolean isAggregateMetricsEnabled() { return _recordIdMap != null; } + public boolean isIndexCapacityThresholdBreached() { Review Comment: Suggest name it `canAddMore()` to be consistent with the method in each index ########## pinot-spi/src/main/java/org/apache/pinot/spi/config/table/IndexingConfig.java: ########## @@ -410,6 +411,14 @@ public void setSegmentNameGeneratorType(String segmentNameGeneratorType) { _segmentNameGeneratorType = segmentNameGeneratorType; } + public boolean isIndexCapacityThresholdCheckEnabled() { + return _indexCapacityThresholdCheckEnabled; + } + + public void setIndexCapacityThresholdCheckEnabled(boolean indexCapacityThresholdCheckEnabled) { + _indexCapacityThresholdCheckEnabled = indexCapacityThresholdCheckEnabled; + } + Review Comment: Given the overhead is very low, I'd suggest making this on by default, or simply remove this flag. We didn't run into this issue often because it is extremely rare to have more than 450M values in a segment. Before this PR, we usually manually reduce the commit threshold to workaround this, which is not ideal, and the user experience is very bad. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org