gortiz commented on issue #12949:
URL: https://github.com/apache/pinot/issues/12949#issuecomment-2079590396

   I was able to replicate a similar plan in  MultiStageQuickStart using:
   
   ```sql
   select 
     playerID,
     numberOfGamesAsBatter,
     playerName,
     runs,
     intentionalWalks,
     CAST(hits as BIGINT) as Zeit
   from baseballStats a
     join dimBaseballTeams b on a.teamID = b.teamID
     where not exists ( select 1 from baseballStats c where c.teamID = a.teamID 
and c.yearID < a.yearID)
   order by Zeit desc
   ```
   
   Which generates
   ```
   Execution Plan
   LogicalSort(sort0=[$5], dir0=[DESC])
     PinotLogicalSortExchange(distribution=[hash], collation=[[5 DESC]], 
isSortOnSender=[false], isSortOnReceiver=[true])
       LogicalProject(playerID=[$3], numberOfGamesAsBatter=[$2], 
playerName=[$4], runs=[$5], intentionalWalks=[$1], Zeit=[CAST($0):BIGINT NOT 
NULL])
         LogicalFilter(condition=[IS NULL($11)])
           LogicalJoin(condition=[AND(=($6, $9), =($7, $10))], joinType=[left])
             PinotLogicalExchange(distribution=[hash[6, 7]])
               LogicalJoin(condition=[=($6, $8)], joinType=[inner])
                 PinotLogicalExchange(distribution=[hash[6]])
                   LogicalProject(hits=[$9], intentionalWalks=[$12], 
numberOfGamesAsBatter=[$15], playerID=[$16], playerName=[$17], runs=[$19], 
teamID=[$25], yearID=[$27])
                     LogicalTableScan(table=[[default, baseballStats]])
                 PinotLogicalExchange(distribution=[hash[0]])
                   LogicalProject(teamID=[$3])
                     LogicalTableScan(table=[[default, dimBaseballTeams]])
             PinotLogicalExchange(distribution=[hash[0, 1]])
               LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
                 PinotLogicalExchange(distribution=[hash[0, 1]])
                   LogicalAggregate(group=[{0, 1}], agg#0=[MIN($2)])
                     LogicalProject(teamID0=[$2], yearID0=[$3], $f0=[true])
                       LogicalJoin(condition=[AND(=($0, $2), <($1, $3))], 
joinType=[inner])
                         PinotLogicalExchange(distribution=[hash[0]])
                           LogicalProject(teamID=[$25], yearID=[$27])
                             LogicalTableScan(table=[[default, baseballStats]])
                         PinotLogicalExchange(distribution=[hash[0]])
                           LogicalAggregate(group=[{0, 1}])
                             PinotLogicalExchange(distribution=[hash[0, 1]])
                               LogicalAggregate(group=[{0, 1}])
                                 LogicalJoin(condition=[=($0, $2)], 
joinType=[inner])
                                   PinotLogicalExchange(distribution=[hash[0]])
                                     LogicalProject(teamID=[$25], yearID=[$27])
                                       LogicalTableScan(table=[[default, 
baseballStats]])
                                   PinotLogicalExchange(distribution=[hash[0]])
                                     LogicalProject(teamID=[$3])
                                       LogicalTableScan(table=[[default, 
dimBaseballTeams]])
   ```
   
   But at lest with the numbers we have in this quickstart, the rows returned 
by this version are smaller than the number returned without the where.
   
   I'll take a look next week trying to understand the rules that have been 
applied to generate that plan.
   
   Meanwhile, can you try to use LAST_WITH_TIME instead? IICU your query, that 
should be semantically equivalent to:
   
   ```
   select a.acctuniquesessionid,
     a.acctstatustype,
     a.ts,
     b.acctstatustype,
     b.ts,
     (cast(b.ts as BIGINT) - cast(a.ts as BIGINT)) / 1000 as Zeit
   from (
       select acctuniquesessionid,
         LAST_WITH_TIME(acctstatustype, ts, "STRING") as acctstatustype,
         MAX(ts) as ts
       from radius_start
       group by acctuniquesessionid
     ) as a
     join radius_stop b on a.acctuniquesessionid = b.acctuniquesessionid
   order by Zeit desc
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to