yihua commented on code in PR #18412:
URL: https://github.com/apache/hudi/pull/18412#discussion_r3047139363
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/BaseHoodieLogRecordReader.java:
##########
@@ -373,11 +409,17 @@ && compareTimestamps(logBlock.getLogBlockHeader().get(INSTANT_TIME), GREATER_THA
validBlockInstants.add(compactedFinalInstantTime);
}
}
+ Collections.reverse(validBlockInstants);
LOG.debug("Number of applied rollback blocks {}", numBlocksRolledBack);
-
+ LOG.info("Total valid instants found are {}. Instants are {}", validBlockInstants.size(), validBlockInstants);
+ if (ignoredBlockCount > 0) {
+ LOG.info("Ignored {} log blocks from {} instants not in the range: {}", ignoredBlockCount, ignoredInstants.size(), ignoredInstants);
Review Comment:
🤖 This INFO log fires once per file slice whenever any blocks are
range-filtered. An incremental read over a narrow window on a large table
would therefore emit one line per file slice, each printing the full
`ignoredInstants` set. If that set is large (many distinct instants filtered
out), the output could get noisy. Would it be worth capping the printed set or
logging it at DEBUG instead?
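
A minimal sketch of the capping option, in case it helps. `CappedInstantLog`, `summarize`, and the `limit` parameter are hypothetical names and not existing Hudi code. The sketch also assumes `ignoredInstants` is a collection of instant-time strings, which the diff does not confirm.

```java
import java.util.Collection;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Hudi): renders at most `limit` elements
// of a collection so a log line stays bounded regardless of the set size.
final class CappedInstantLog {
  private CappedInstantLog() {
  }

  static String summarize(Collection<String> instants, int limit) {
    if (limit <= 0) {
      throw new IllegalArgumentException("limit must be positive");
    }
    // Take the first `limit` elements in the collection's iteration order.
    String shown = instants.stream().limit(limit).collect(Collectors.joining(", "));
    // Number of elements left out of the rendered string.
    int hidden = Math.max(0, instants.size() - limit);
    return hidden == 0
        ? "[" + shown + "]"
        : "[" + shown + ", ... (" + hidden + " more)]";
  }
}
```

The call site could then pass `CappedInstantLog.summarize(ignoredInstants, 10)` as the last log argument in place of `ignoredInstants`; the limit of 10 is arbitrary. The other option is to keep the counts at INFO and move the full set to a `LOG.debug(...)` call, which costs nothing extra when DEBUG is disabled because the `{}` placeholders are only formatted if the level is enabled.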