jackjlli commented on a change in pull request #6361:
URL: https://github.com/apache/incubator-pinot/pull/6361#discussion_r568138689



##########
File path: 
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/HelixInstanceDataManager.java
##########
@@ -326,6 +339,11 @@ public SegmentMetadata getSegmentMetadata(String 
tableNameWithType, String segme
     }
   }
 
+  @Override
+  public Set<String> getColumnNamesByTable(String tableNameWithType) {
+    return _tableColumnNamesMap.computeIfAbsent(tableNameWithType, k -> 
Collections.emptySet());

Review comment:
       I don't quite fully understand the meaning of testing the overhead of 
computing the map once per query. The table columns are fetched from the schema 
which is stored in Zookeeper, and they are used to detect the column name 
mismatch during query execution. If schema fetch is done once per query, the ZK 
access would be a bottleneck for query execution. Plus, it'd be good to stick 
to one behavior in all different circumstances. Having a cache for high qps 
usecase fabric while directly fetching once per query for large MT cluster 
would make the code more difficult to maintain.
   
   `_tableColumnNamesMap` would stay the same state with 
`_tableDataManagerMap`. This is to make sure that the behavior of these two 
object are always consistent. And I don't see `_tableDataManagerMap` drops any 
reference of dataManager, until the server is shutting down.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to