Jackie-Jiang commented on code in PR #11824: URL: https://github.com/apache/pinot/pull/11824#discussion_r1381215533
########## pinot-query-planner/src/test/java/org/apache/pinot/query/testutils/MockRoutingManagerFactory.java: ########## @@ -63,25 +66,28 @@ public MockRoutingManagerFactory(int... ports) { _schemaMap = new HashMap<>(); _hybridTables = new HashSet<>(); _serverInstances = new HashMap<>(); + _nullHandlingMap = new HashMap<>(); _tableServerSegmentsMap = new HashMap<>(); for (int port : ports) { _serverInstances.put(toHostname(port), getServerInstance(HOST_NAME, port, port, port, port)); } } - public void registerTable(Schema schema, String tableName) { + public void registerTable(Schema schema, String tableName, boolean enableNullHandling) { Review Comment: Is the intention to generate null value vector for the table? IMO we should always generate null value vector. We will persist the null value vector only when there are null values. ########## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/nullvalue/NullValueIndexType.java: ########## @@ -77,11 +81,24 @@ public String getPrettyName() { @Override public ColumnConfigDeserializer<IndexConfig> createDeserializer() { - return IndexConfigDeserializer.ifIndexingConfig( - IndexConfigDeserializer.alwaysCall((TableConfig tableConfig, Schema schema) -> - tableConfig.getIndexingConfig().isNullHandlingEnabled() - ? IndexConfig.ENABLED - : IndexConfig.DISABLED)); + return (TableConfig tableConfig, Schema schema) -> { Review Comment: > That is not correct. There is a special case for null value vector in [SegmentColumnarIndexCreator](https://github.com/apache/pinot/blob/55f29fe04dc7ed0ff18737fcec5b272969d6aba1/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java#L220) that creates the index depending on config.isNullHandlingEnabled without actually reading index configuration. Does that mean we create null value vector for all columns only when it is configured in the table config, and the column level nullable is ignored; When reading the null value vector, we do check the column level nullable flag? This behavior seems inconsistent and there is no way to only create null value vector for part of the columns. Can we make the write path logic the same as read path? i.e. check if it is enabled at column level, then decide whether to create the null value vector? ########## pinot-query-runtime/src/test/resources/queries/CountDistinct.json: ########## @@ -119,11 +119,13 @@ { "comments": "table aren't actually partitioned by val thus all segments can produce duplicate results, thus [[8]]", "sql": "SELECT SEGMENT_PARTITIONED_DISTINCT_COUNT(val) FROM {tbl1}", + "ignored": true, Review Comment: What exception did you get for these queries? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org