walterddr opened a new issue, #9944:
URL: https://github.com/apache/pinot/issues/9944

   see:
   ```
   SELECT 
     count(LO_ORDERKEY)
   FROM 
     lineorder JOIN customer ON LO_CUSTKEY = C_CUSTKEY
   WHERE 
     LO_QUANTITY < 10 AND C_REGION = 'AFRICA'
   
   LogicalAggregate(group=[{}], EXPR$0=[$SUM0($0)]): rowcount = 1.0, cumulative 
cost = 17819.155154440043, id = 44195 
     LogicalExchange(distribution=[hash]): rowcount = 1.0, cumulative cost = 
17818.030154440043, id = 44193 
       LogicalAggregate(group=[{}], EXPR$0=[COUNT()]): rowcount = 1.0, 
cumulative cost = 17814.030154440043, id = 44188 
         LogicalJoin(condition=[=($7, $30)], joinType=[inner]): rowcount = 
112.5, cumulative cost = 17812.905154440043, id = 44186  
           LogicalExchange(distribution=[hash[7]]): rowcount = 50.0, cumulative 
cost = 15798.092021712584, id = 44181 
             LogicalFilter(condition=[<($2, 10)]): rowcount = 50.0, cumulative 
cost = 150.0, id = 44173 
               LogicalTableScan(table=[[lineorder]]): rowcount = 100.0, 
cumulative cost = 100.0, id = 44157 
           LogicalExchange(distribution=[hash[10]]): rowcount = 15.0, 
cumulative cost = 1902.3131327274587, id = 44182 
             LogicalFilter(condition=[=($7, 'AFRICA')]): rowcount = 15.0, 
cumulative cost = 115.0, id = 44174 
               LogicalTableScan(table=[[customer]]): rowcount = 100.0, 
cumulative cost = 100.0, id = 44159
   ```
   The explained plan requires leaf stages to return the entire collection of 
columns 
   
   where as:
   ```
   SELECT count(LO_ORDERKEY)
   FROM (SELECT LO_ORDERKEY, LO_CUSTKEY FROM lineorder WHERE LO_QUANTITY < 10) 
AS l JOIN 
        (SELECT C_CUSTKEY FROM customer WHERE  C_REGION = 'AFRICA') AS c ON 
l.LO_CUSTKEY = c.C_CUSTKEY
   
   
   LogicalAggregate(group=[{}], EXPR$0=[$SUM0($0)]): rowcount = 1.0, cumulative 
cost = 2176.042214237391, id = 44268 
     LogicalExchange(distribution=[hash]): rowcount = 1.0, cumulative cost = 
2174.917214237391, id = 44266 
       LogicalAggregate(group=[{}], EXPR$0=[COUNT()]): rowcount = 1.0, 
cumulative cost = 2170.917214237391, id = 44261 
         LogicalJoin(condition=[=($1, $2)], joinType=[inner]): rowcount = 
112.5, cumulative cost = 2169.792214237391, id = 44259  <-- NOTE: this is 
condition =($1, $2)
           LogicalExchange(distribution=[hash[1]]): rowcount = 50.0, cumulative 
cost = 1764.8092021712584, id = 44254 
             LogicalProject(LO_ORDERKEY=[$0], LO_CUSTKEY=[$7]): rowcount = 
50.0, cumulative cost = 200.0, id = 44249 
               LogicalFilter(condition=[<($2, 10)]): rowcount = 50.0, 
cumulative cost = 150.0, id = 44239 
                 LogicalTableScan(table=[[lineorder]]): rowcount = 100.0, 
cumulative cost = 100.0, id = 44198 
                LogicalExchange(distribution=[hash[0]]): rowcount = 15.0, 
cumulative cost = 292.4830120661326, id = 44255 
                  LogicalProject(C_CUSTKEY=[$10]): rowcount = 15.0, cumulative 
cost = 130.0, id = 44251 
                    LogicalFilter(condition=[=($7, 'AFRICA')]): rowcount = 
15.0, cumulative cost = 115.0, id = 44244 
                      LogicalTableScan(table=[[customer]]): rowcount = 100.0, 
cumulative cost = 100.0, id = 44202
   ```
   if we wrapped in sub-query, it properly put the LogicalProject in-place to 
only return the needed columns.
   
   This should be done automatically - if a count is only operating on one of 
the columns, no others should be returned via the exchange.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to