yihua commented on code in PR #18306:
URL: https://github.com/apache/hudi/pull/18306#discussion_r3036084259


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java:
##########
@@ -220,6 +235,26 @@ private boolean needCompact(CompactionTriggerStrategy 
compactionTriggerStrategy)
     return compactable;
   }
 
+  /**
+   * Determines whether log compaction should be scheduled based on the number 
of delta commits
+   * since the last compaction and the last log compaction, compared against 
the
+   * {@code hoodie.log.compaction.blocks.threshold} config.
+   */
+  private boolean needLogCompact(Pair<Integer, String> 
latestDeltaCommitInfoSinceCompact) {
+    Option<Pair<Integer, String>> latestDeltaCommitInfoSinceLogCompactOption = 
getLatestDeltaCommitInfoSinceLogCompaction();
+    int numDeltaCommitsSinceLatestCompaction = 
latestDeltaCommitInfoSinceCompact.getLeft();
+    int numDeltaCommitsSinceLatestLogCompaction = 
latestDeltaCommitInfoSinceLogCompactOption.isPresent()
+        ? latestDeltaCommitInfoSinceLogCompactOption.get().getLeft()
+        : 0;
+
+    int numDeltaCommitsSince = Math.min(numDeltaCommitsSinceLatestCompaction, 
numDeltaCommitsSinceLatestLogCompaction);
+    boolean shouldLogCompact = numDeltaCommitsSince >= 
config.getLogCompactionBlocksThreshold();
+    if (shouldLogCompact) {
+      log.info("There have been {} delta commits since last compaction or log 
compaction, triggering log compaction.", numDeltaCommitsSince);

Review Comment:
   <a href="#"><img alt="P2" 
src="https://greptile-static-assets.s3.amazonaws.com/badges/p2.svg?v=7"; 
align="top"></a> **Inaccurate log message wording**
   
   `numDeltaCommitsSince` is `Math.min(sinceCompaction, sinceLogCompaction)`, 
which represents the count since the **most recent** of the two events. The 
current message "since last compaction or log compaction" is ambiguous — it 
reads as if either event is used as the reference. Consider:
   
   ```suggestion
         log.info("There have been {} delta commits since the most recent 
compaction or log compaction, triggering log compaction.", 
numDeltaCommitsSince);
   ```
   
   — *Greptile* 
([original](https://github.com/yihua/hudi/pull/11#discussion_r3036084156)) 
(source:comment#3036084156)



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java:
##########
@@ -220,6 +235,26 @@ private boolean needCompact(CompactionTriggerStrategy 
compactionTriggerStrategy)
     return compactable;
   }
 
+  /**
+   * Determines whether log compaction should be scheduled based on the number 
of delta commits
+   * since the last compaction and the last log compaction, compared against 
the
+   * {@code hoodie.log.compaction.blocks.threshold} config.
+   */
+  private boolean needLogCompact(Pair<Integer, String> 
latestDeltaCommitInfoSinceCompact) {
+    Option<Pair<Integer, String>> latestDeltaCommitInfoSinceLogCompactOption = 
getLatestDeltaCommitInfoSinceLogCompaction();
+    int numDeltaCommitsSinceLatestCompaction = 
latestDeltaCommitInfoSinceCompact.getLeft();
+    int numDeltaCommitsSinceLatestLogCompaction = 
latestDeltaCommitInfoSinceLogCompactOption.isPresent()
+        ? latestDeltaCommitInfoSinceLogCompactOption.get().getLeft()
+        : 0;
+
+    int numDeltaCommitsSince = Math.min(numDeltaCommitsSinceLatestCompaction, 
numDeltaCommitsSinceLatestLogCompaction);
+    boolean shouldLogCompact = numDeltaCommitsSince >= 
config.getLogCompactionBlocksThreshold();

Review Comment:
   <a href="#"><img alt="P1" 
src="https://greptile-static-assets.s3.amazonaws.com/badges/p1.svg?v=7"; 
align="top"></a> **Implicit sentinel `0` breaks when 
`LogCompactionBlocksThreshold` is 0**
   
   When `latestDeltaCommitInfoSinceLogCompactOption` is empty (which happens 
when `getDeltaCommitsSinceLatestCompletedLogCompaction` detects a pending log 
compaction), `numDeltaCommitsSinceLatestLogCompaction` is set to `0`. The 
subsequent `Math.min(X, 0) = 0` then makes `0 >= threshold` evaluate to 
`false`, silently preventing a second log compaction from being scheduled on 
top of a pending one.
   
   This works correctly for any positive threshold, but if 
`config.getLogCompactionBlocksThreshold()` ever returns `0`, the expression `0 
>= 0` evaluates to `true` — causing a new log compaction to be incorrectly 
scheduled while one is already in-flight. An explicit early return is clearer 
and threshold-agnostic:
   
   ```suggestion
       Option<Pair<Integer, String>> latestDeltaCommitInfoSinceLogCompactOption 
= getLatestDeltaCommitInfoSinceLogCompaction();
       if (!latestDeltaCommitInfoSinceLogCompactOption.isPresent()) {
         // Pending log compaction in progress — do not schedule another one
         return false;
       }
       int numDeltaCommitsSinceLatestCompaction = 
latestDeltaCommitInfoSinceCompact.getLeft();
       int numDeltaCommitsSinceLatestLogCompaction = 
latestDeltaCommitInfoSinceLogCompactOption.get().getLeft();
   ```
   
   — *Greptile* 
([original](https://github.com/yihua/hudi/pull/11#discussion_r3036084148)) 
(source:comment#3036084148)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to