gortiz opened a new issue, #13126:
URL: https://github.com/apache/pinot/issues/13126

   In SQL intersect may be modified with the all modifier. For example:
   
   if table A contains a column a with values `[1,1,2,3,4]` and B contains a 
column b with values `[1, 1, 2]`:
   ```
   select * 
   from (select a from A)
   intersect (select b from B)
   ```
   Returns `[1, 2]` while
   ```
   select * 
   from (select a from A)
   intersect ALL (select b from B)
   ```
   returns `[1,1,2]`.
   
   Currently Pinot accepts the `ALL` modifier and it is shown in the explain 
plan, but the semantics are always the same. Specifically, Pinot semantics are 
the ones of intersect without all modifier.
   
   You can verify that by running `ColocatedJoinEngineQuickStart` and executing:
   ```
   select userUUID
   from (select userUUID from userGroups)
   intersect all
   (select userUUID from userGroups)
   ```
   with and without all.
   
   intersect all should return the same number of rows than `count()` (which is 
`2494`) while intersect should return the same number of rows than 
`count(distinct(userUUID))` (which is `2470`).
   
   But the returned number of rows is `2470` with and without all modifier
   
   As a short term solution we can fail if all modifier is supplied, as it 
would be better than returning incorrect results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to