ankitsultana commented on code in PR #16344:
URL: https://github.com/apache/pinot/pull/16344#discussion_r2304475167


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/segment/CommittingSegmentDescriptor.java:
##########
@@ -40,6 +41,12 @@ public static CommittingSegmentDescriptor 
fromSegmentCompletionReqParams(
             reqParams.getSegmentSizeBytes());
     
committingSegmentDescriptor.setSegmentLocation(reqParams.getSegmentLocation());
     committingSegmentDescriptor.setStopReason(reqParams.getReason());
+
+    // Capture pre-commit row count from the request (for commit time 
compaction awareness)

Review Comment:
   nit: better to use explicit flags for the behavior you intend. derivative 
logic becomes complicated long term to understand



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/segment/SizeBasedSegmentFlushThresholdComputer.java:
##########
@@ -76,16 +76,38 @@ void setSegmentRowsToSizeRatio(double 
segmentRowsToSizeRatio) {
   synchronized void onSegmentCommit(CommittingSegmentDescriptor 
committingSegmentDescriptor,
       SegmentZKMetadata committingSegmentZKMetadata) {
     String segmentName = committingSegmentZKMetadata.getSegmentName();
-    int rowsConsumed = (int) committingSegmentZKMetadata.getTotalDocs();
-    long sizeInBytes = committingSegmentDescriptor.getSegmentSizeBytes();
+    long postCommitRows = (int) committingSegmentZKMetadata.getTotalDocs();
+    long preCommitRows = committingSegmentDescriptor.getPreCommitRowCount();
+    long postCommitSizeBytes = 
committingSegmentDescriptor.getSegmentSizeBytes();
+
+    // Use pre-commit rows if available (for commit time compaction), 
otherwise use post-commit rows
+    boolean usingPreCommitRows = preCommitRows > 0;
+    long rowsForCalculation = usingPreCommitRows ? preCommitRows : 
postCommitRows;
+
+    // Estimate pre-commit size when using pre-commit rows. If post-commit 
rows are 0 (cannot infer),
+    // fall back to using post-commit size for estimation.
+    long sizeForCalculation;

Review Comment:
   can you move this to a method. We can also add a UT for this then.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to