amrishlal commented on a change in pull request #7959: URL: https://github.com/apache/pinot/pull/7959#discussion_r792997931
##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
##########
@@ -1487,6 +1498,46 @@ static void updateColumnNames(String rawTableName, PinotQuery pinotQuery, boolea
     }
   }
 
+  private static void expandStarExpressionToActualColumns(PinotQuery pinotQuery, Map<String, String> columnNameMap,
+      Expression selectStarExpr) {
+    List<Expression> originalSelections = pinotQuery.getSelectList();
+    // Avoid using stream apis in query path because we have found that it has poorer performance compared to
+    // regular apis.
+    Set<String> originallySelectedColumnNames = new HashSet<>();
+    for (Expression originalSelection : originalSelections) {
+      if (originalSelection.isSetIdentifier()) {
+        originallySelectedColumnNames.add(originalSelection.getIdentifier().getName());
+      }
+    }
+    List<Expression> newSelections = new ArrayList<>();
+    for (Expression selection : originalSelections) {
+      if (selection.equals(selectStarExpr)) {
+        //expand '*' to actual columns, exclude default virtual columns
+        for (String tableCol : columnNameMap.values()) {
+          //we exclude default virtual columns and those columns that are already a part of originalSelections (to
+          // dedup columns that are selected multiple times)

Review comment:
@Jackie-Jiang MySQL doesn't dedup in any of the cases listed above; the only difference is that it does not allow multiple stars in the select list. I don't have access to PostgreSQL, but I believe (as @suddendust tested earlier) its behavior is the same as MySQL's, except that PostgreSQL allows multiple stars in the select list. The w3schools console might be applying custom UI logic (note how it changes the column name for the query `SELECT CustomerID, CustomerID FROM Customers` even though no alias is used). There is nothing wrong with deduping per se, and either approach would work, but I think we need to be consistent.
For example:

* If we are deduping the select list, then shouldn't `select *, *` also be deduped (the w3schools SQL console does dedupe here)?
* If we decide to dedup `select *, *`, then shouldn't `select playerID, playerID` (existing functionality that is not currently deduped) also be deduped?
* What about `select playerID, *, playerID`: how will dedupe work in this case, given that we currently don't dedupe `playerID, playerID` but will dedupe `playerID, *`?
* Do we have an example of an actual commercial database that dedups the select list?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
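
To make the consistency question in the review comment concrete, here is a minimal sketch of the two uniform policies: expand `*` and keep every duplicate, or expand `*` and dedupe the whole select list. This is not Pinot code. The class `SelectListExpansion`, both method names, and the modeling of select items as plain strings are assumptions made purely for illustration, and virtual-column filtering is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not part of Pinot: select items are modeled as plain strings.
class SelectListExpansion {
  private static final String STAR = "*";

  // Policy A: expand every '*' in place and keep all duplicates.
  static List<String> expandKeepingDuplicates(List<String> selectList, List<String> tableColumns) {
    List<String> expanded = new ArrayList<>();
    // Plain loops rather than streams, matching the convention noted in the diff.
    for (String item : selectList) {
      if (STAR.equals(item)) {
        expanded.addAll(tableColumns);
      } else {
        expanded.add(item);
      }
    }
    return expanded;
  }

  // Policy B: expand, then drop every repeated column, keeping first-occurrence order.
  static List<String> expandWithDedup(List<String> selectList, List<String> tableColumns) {
    return new ArrayList<>(new LinkedHashSet<>(expandKeepingDuplicates(selectList, tableColumns)));
  }

  public static void main(String[] args) {
    List<String> columns = List.of("playerID", "teamID");
    List<String> query = List.of("playerID", "*", "playerID");
    System.out.println(expandKeepingDuplicates(query, columns)); // [playerID, playerID, teamID, playerID]
    System.out.println(expandWithDedup(query, columns));         // [playerID, teamID]
  }
}
```

Under either policy, `select *, *`, `select playerID, playerID`, and `select playerID, *, playerID` are all treated by the same rule. Deduping only the columns produced by `*` would instead turn `playerID, *, playerID` into `playerID, teamID, playerID`, which is the mixed result the questions above point at.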