morningman commented on code in PR #56117:
URL: https://github.com/apache/doris/pull/56117#discussion_r2366626252


##########
fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java:
##########
@@ -286,7 +287,26 @@ public List<Split> getSplits(int numBackends) throws UserException {
         // And for counting the number of selected partitions for this paimon table.
         Map<BinaryRow, Map<String, String>> partitionInfoMaps = new HashMap<>();
         // if applyCountPushdown is true, we can't split the DataSplit
-        long realFileSplitSize = getRealFileSplitSize(applyCountPushdown ? Long.MAX_VALUE : 0);
+        long realFileSplitSize = applyCountPushdown ? Long.MAX_VALUE : getRealFileSplitSize(0);
+
+        long maxExternalSplitNum = sessionVariable.getMaxExternalSplitNum();
+        Preconditions.checkState(maxExternalSplitNum > 0,
+                "max_external_split_num must be greater than 0, but got: " + maxExternalSplitNum);
+        long totalRawFileSize = dataSplits.stream()
+                .mapToLong(split -> split.rawConvertible()
+                        ? split.convertToRawFiles().get().stream().mapToLong(RawFile::fileSize).sum()
+                        : 0)
+                .sum();
+        if (totalRawFileSize > 0 && realFileSplitSize > 0) {
+            long estimatedSplitNum = totalRawFileSize / realFileSplitSize;
+            if (estimatedSplitNum > maxExternalSplitNum) {
+                realFileSplitSize = totalRawFileSize / maxExternalSplitNum;
+                LOG.info("The estimated split num is {} which exceeds the limit {}, "

Review Comment:
   Use `LOG.debug` instead of `LOG.info` here.
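
   For reference, the capping arithmetic from the hunk above can be sketched in isolation. This is a minimal stand-alone sketch, not the PR's code: the class name `SplitSizeCap` and the method `capSplitSize` are hypothetical, and `java.util.logging` at `FINE` level stands in for a debug-level log call so that the example has no external dependencies. It assumes `maxSplitNum` has already been validated as positive.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not part of PaimonScanNode.
final class SplitSizeCap {
    private static final Logger LOG = Logger.getLogger(SplitSizeCap.class.getName());

    /**
     * Returns the split size to use so that the estimated number of splits
     * (totalRawFileSize / splitSize) does not exceed maxSplitNum.
     * Assumes maxSplitNum > 0 was checked by the caller.
     */
    static long capSplitSize(long totalRawFileSize, long splitSize, long maxSplitNum) {
        // Nothing to adjust when there is no raw file data or no usable split size.
        if (totalRawFileSize <= 0 || splitSize <= 0) {
            return splitSize;
        }
        long estimatedSplitNum = totalRawFileSize / splitSize;
        if (estimatedSplitNum <= maxSplitNum) {
            return splitSize;
        }
        // Enlarge the split size so the estimate drops to roughly maxSplitNum.
        long adjusted = totalRawFileSize / maxSplitNum;
        // FINE is the closest java.util.logging level to "debug".
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Estimated split num " + estimatedSplitNum + " exceeds limit "
                    + maxSplitNum + ", split size adjusted to " + adjusted);
        }
        return adjusted;
    }
}
```

   With a split size of `Long.MAX_VALUE` (the count-pushdown case), the estimate is 0 and the size is returned unchanged.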



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java:
##########
@@ -286,7 +287,26 @@ public List<Split> getSplits(int numBackends) throws UserException {
         // And for counting the number of selected partitions for this paimon table.
         Map<BinaryRow, Map<String, String>> partitionInfoMaps = new HashMap<>();
         // if applyCountPushdown is true, we can't split the DataSplit
-        long realFileSplitSize = getRealFileSplitSize(applyCountPushdown ? Long.MAX_VALUE : 0);
+        long realFileSplitSize = applyCountPushdown ? Long.MAX_VALUE : getRealFileSplitSize(0);
+
+        long maxExternalSplitNum = sessionVariable.getMaxExternalSplitNum();
+        Preconditions.checkState(maxExternalSplitNum > 0,

Review Comment:
   Move this check into the session variable's `checker` method.
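
   The idea is to reject an invalid value when the variable is set, rather than at scan time. The sketch below only illustrates that pattern with a stand-alone class; `SessionVariableSketch`, its setter, and the placeholder default are hypothetical, and the way the real `SessionVariable` wires up its `checker` is not reproduced here.

```java
// Hypothetical stand-in for a session variable holder; illustration only.
final class SessionVariableSketch {
    // Placeholder default, chosen only for this sketch.
    private long maxExternalSplitNum = 1L;

    /**
     * Checker-style validation: receives the raw string value and throws
     * if it is not a positive integer.
     */
    static void checkMaxExternalSplitNum(String value) {
        long parsed;
        try {
            parsed = Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "max_external_split_num must be an integer, but got: " + value, e);
        }
        if (parsed <= 0) {
            throw new IllegalArgumentException(
                    "max_external_split_num must be greater than 0, but got: " + value);
        }
    }

    /** Validates first, then stores; an invalid value never reaches the field. */
    void setMaxExternalSplitNum(String value) {
        checkMaxExternalSplitNum(value);
        this.maxExternalSplitNum = Long.parseLong(value.trim());
    }

    long getMaxExternalSplitNum() {
        return maxExternalSplitNum;
    }
}
```

   With validation done at set time, `getSplits` can rely on the value being positive and the `Preconditions.checkState` call becomes unnecessary.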



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

