Gabriel39 commented on code in PR #15642:
URL: https://github.com/apache/doris/pull/15642#discussion_r1071033878


##########
fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java:
##########
@@ -591,6 +591,33 @@ public void analyze(Analyzer analyzer) throws 
UserException {
         analyzeAggregation(analyzer);
         createAnalyticInfo(analyzer);
         eliminatingSortNode();
+        if (checkEnableTwoPhaseRead()) {
+            // If optimize enabled, we try our best to read less columns from 
ScanNode,
+            // here we analyze conjunct exprs and ordering exprs before 
resultExprs,
+            // rest of resultExprs will be marked as `INVALID`, such columns 
will
+            // be prevent from reading from ScanNode.Those columns will be 
finally
+            // read by the second fetch phase
+            LOG.info("two phase read optimize enabled");

Review Comment:
   ```suggestion
               LOG.debug("two phase read optimize enabled");
   ```



##########
gensrc/thrift/PlanNodes.thrift:
##########
@@ -891,6 +894,8 @@ struct TExchangeNode {
   2: optional TSortInfo sort_info
   // This is tHe number of rows to skip before returning results
   3: optional i64 offset
+  // used for second phase fetch

Review Comment:
   ```suggestion
     // Nodes in this cluster, used for second phase fetch
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java:
##########
@@ -619,6 +646,39 @@ public void analyze(Analyzer analyzer) throws 
UserException {
         }
     }
 
+    // Check whether enable two phase read optimize, if enabled query will be 
devieded into two phase read:
+    // 1. read conjuncts columns and order by columns along with an extra 
RowId column from ScanNode
+    // 2. sort and filter data, and get final RowId column, spawn RPC to other 
BE to fetch final data
+    // 3. final matrialize all data
+    public boolean checkEnableTwoPhaseRead() {
+        // Only handle the simplest `SELECT ... FROM <tbl> WHERE ... ORDER BY 
... LIMIT ...` query
+        if (getAggInfo() != null
+                || getHavingPred() != null
+                || getWithClause() != null) {
+            return false;
+        }
+        // single olap table
+        List<TableRef> tblRefs = getTableRefs();
+        if (tblRefs.size() != 1 || !(tblRefs.get(0) instanceof BaseTableRef)) {
+            return false;
+        }
+        TableRef tbl = tblRefs.get(0);
+        if (tbl.getTable().getType() != Table.TableType.OLAP) {
+            return false;
+        }
+        // need enable light schema change

Review Comment:
   Why is light schema change required?



##########
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java:
##########
@@ -1052,6 +1060,10 @@ protected void toThrift(TPlanNode msg) {
         if (pushDownAggNoGroupingOp != null) {
             msg.olap_scan_node.setPushDownAggTypeOpt(pushDownAggNoGroupingOp);
         }
+
+        if (orderingExprs != null) {

Review Comment:
   Is this always false?



##########
fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java:
##########
@@ -619,6 +646,39 @@ public void analyze(Analyzer analyzer) throws 
UserException {
         }
     }
 
+    // Check whether enable two phase read optimize, if enabled query will be 
devieded into two phase read:
+    // 1. read conjuncts columns and order by columns along with an extra 
RowId column from ScanNode
+    // 2. sort and filter data, and get final RowId column, spawn RPC to other 
BE to fetch final data
+    // 3. final matrialize all data
+    public boolean checkEnableTwoPhaseRead() {

Review Comment:
   Add a session variable to turn off this feature



##########
fe/fe-core/src/main/java/org/apache/doris/planner/SortNode.java:
##########
@@ -142,6 +144,15 @@ public void setUseTopnOpt(boolean useTopnOpt) {
         this.useTopnOpt = useTopnOpt;
     }
 
+    public List<Expr> getResolvedTupleExprs() {
+        return resolvedTupleExprs;
+    }
+
+    public boolean isUseTopNTwoPhaseOptimize() {

Review Comment:
   ```suggestion
       public boolean useTwoPhaseTopNOptimization() {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to