tarun11Mavani commented on code in PR #16344:
URL: https://github.com/apache/pinot/pull/16344#discussion_r2306242711


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/segment/SizeBasedSegmentFlushThresholdComputer.java:
##########
@@ -76,16 +76,38 @@ void setSegmentRowsToSizeRatio(double 
segmentRowsToSizeRatio) {
   synchronized void onSegmentCommit(CommittingSegmentDescriptor 
committingSegmentDescriptor,
       SegmentZKMetadata committingSegmentZKMetadata) {
     String segmentName = committingSegmentZKMetadata.getSegmentName();
-    int rowsConsumed = (int) committingSegmentZKMetadata.getTotalDocs();
-    long sizeInBytes = committingSegmentDescriptor.getSegmentSizeBytes();
+    long postCommitRows = (int) committingSegmentZKMetadata.getTotalDocs();
+    long preCommitRows = committingSegmentDescriptor.getPreCommitRowCount();
+    long postCommitSizeBytes = 
committingSegmentDescriptor.getSegmentSizeBytes();
+
+    // Use pre-commit rows if available (for commit time compaction), 
otherwise use post-commit rows
+    boolean usingPreCommitRows = preCommitRows > 0;
+    long rowsForCalculation = usingPreCommitRows ? preCommitRows : 
postCommitRows;
+
+    // Estimate pre-commit size when using pre-commit rows. If post-commit 
rows are 0 (cannot infer),
+    // fall back to using post-commit size for estimation.
+    long sizeForCalculation;

Review Comment:
   Done.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/segment/CommittingSegmentDescriptor.java:
##########
@@ -40,6 +41,12 @@ public static CommittingSegmentDescriptor 
fromSegmentCompletionReqParams(
             reqParams.getSegmentSizeBytes());
     
committingSegmentDescriptor.setSegmentLocation(reqParams.getSegmentLocation());
     committingSegmentDescriptor.setStopReason(reqParams.getReason());
+
+    // Capture pre-commit row count from the request (for commit time 
compaction awareness)

Review Comment:
   NumRows is always available so removed the check here. 
   This PreCommitRowCount is set for all scenario. Added a check to see if we 
are using the compaction in SizeBasedSegmentFlushThresholdComputer.java



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to