gortiz opened a new pull request, #15071:
URL: https://github.com/apache/pinot/pull/15071

   We detected that `cancel` and `earlyTerminate` methods were not actually 
interrupting SSE threads. Specifically, in 
`LeafStageTransferableBlockOperatorTest`, newly added tests 
`cancelMethodInterruptsSseTasks` and `earlyTerminateMethodInterruptsSseTasks` 
would fail in master.
   
   This, combined by the fact that the OpChain (and therefore their MSE 
operators) were not closed when error blocks are received, means that SSE 
threads were not interrupted when there were some errors.
   
   For example, imagine the following query (that can be ran in 
`ColocatedJoinEngineQuickStart`):
   ```sql
   select * 
   from userGroups as g1 
   cross join (
     select * 
     from userAttributes as a1
     cross join userAttributes as a2
   )
   limit 10
   ```
   
   This query will always fail because the inner cross join exceeds the max 
number of rows in a join. But in master, the leaf stage that scans `userGroups` 
will not be canceled and will run until:
   1. All blocks are read
   2. Timeout is detected
   
   In the same situation in this PR, the leaf stage will be canceled in the 
following situations:
   1. The leaf stage produces at least one block, which is sent downstream. 
When it reaches the send mailbox, it notifies the early terminate, which, in 
the new code, actually interrupts the SSE threads.
   2. Timeout is detected 
   
   We can improve this mechanism further. Specifically, upstream (aka parent 
stages) could send a message to the upstream (children stages) indicating that 
the computation has been aborted. This would improve the cancellation process, 
especially when the leaf stage executes expensive work like sorting or grouping 
without ST indexes. However, it would require applying changes in the worker 
protocol. IMHO, what we should do is rethink this protocol and use reactive 
streams instead. To me, it looks like we are trying to reinvent the wheel here, 
trying to make a partial implementation of reactive streams that is not good 
enough.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to