amrishlal commented on a change in pull request #6942:
URL: https://github.com/apache/incubator-pinot/pull/6942#discussion_r635595775



##########
File path: 
pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
##########
@@ -1682,6 +1685,74 @@ static void validateRequest(PinotQuery pinotQuery, int 
queryResponseLimit) {
     if (!queryOptions.isGroupByModeSQL() || 
!queryOptions.isResponseFormatSQL()) {
       throw new IllegalStateException("SQL query should always have response 
format and group-by mode set to SQL");
     }
+
+    // Validate column names from query
+    Set<String> columnsFromQuery = getColumnsFromQuery(pinotQuery);
+    if (!columnNamesFromSchema.containsAll(columnsFromQuery)) {
+      brokerMetrics.addMeteredTableValue(rawTableName, 
BrokerMeter.INVALID_COLUMN_NAMES_IN_QUERY, 1L);
+    }
+  }
+
+  /**
+   * Fetch column names from the SQL query.
+   * @param pinotQuery pinot query
+   */
+  private static Set<String> getColumnsFromQuery(PinotQuery pinotQuery) {

Review comment:
       Most queries should have valid column names, so from what I understand 
this check is only useful for those few queries that won't have invalid column 
names, yet to detect those few cases we seem to be putting the cost on all 
queries.
   
   I don't mean to say that we shouldn't do this check (semantic checking of 
the query is useful), but it would be good do it in a way where multiple checks 
(invalid column, invalid table name, a string column name being passed to a 
function that takes numerical columns - avg for example, etc) are handled 
within one or two recursive decent passes over the query tree . Minor changes 
to the query tree could also be made such as what is being done in 
`updateColumnNames` function which seems to be traversing the query tree very 
similar to this code. `updateColumnNames` and this function could be combined 
to reduce the number of passes.
   
   Also `DataSchemaSegmentPruner` (on server side) is already checking if a 
column name is present in segment. I am wondering if incrementing the metric 
there would work?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to