amrishlal commented on pull request #7678: URL: https://github.com/apache/pinot/pull/7678#issuecomment-959723661
> > The fundamental problem here is that DISTINCT should not be modeled as a function. > True. DISTINCT is a keyword in SQL that should apply to the entire output row, but not to specific values within the row. The DISTINCT query should look like: `SELECT DISTINCT a AS col1, b AS col2 FROM table.` but not `SELECT DISTINCT(a, b) AS col3 FROM table.` because the latter can be abused, as you described, to write semantically incorrect statements such as: `SELECT DISTINCT(a), b FROM table.` Internally though, there is nothing wrong in rewriting query with DISTINCT keyword into a query with DISTINCT function provided that we make sure that the DISTINCT function includes all expressions from the select list and passes the alias associated with those expressions as well. Externally in SQL, the use of DISTINCT function should either be blocked or deprecated or at the very least restricted so that one cannot specify columns or expressions in the select list that are outside the DISTINCT function if one exists in the select list. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org