lgo opened a new issue #5820: URL: https://github.com/apache/incubator-pinot/issues/5820
I've been playing around with queries and ran into an execution problem (this isn't really blocking anything, just doing toying with table setups). Specifically, it seems to happen when a query is grouping by a multi-value string dimension (`fields`) as well as a non-dictionary encoded string dimension (`id`). The setup below will reproduce it. The error doesn't happen with `"noDictionaryColumns": ["id"]`. ### Query ``` SELECT fields, id, SUM(amount) FROM testdata GROUP BY fields, id ``` ### Exception ``` QueryExecutionError: java.lang.NullPointerException at org.apache.pinot.core.operator.combine.GroupByOrderByCombineOperator.getNextBlock(GroupByOrderByCombineOperator.java:215) at org.apache.pinot.core.operator.combine.GroupByOrderByCombineOperator.getNextBlock(GroupByOrderByCombineOperator.java:62) at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:37) at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:26) at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:48) at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:221) at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155) at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:139) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) ``` ### Table ``` { "tableName": "testdata", "tableType": "OFFLINE", "routing": { "segmentPrunerType": "partition" }, "segmentsConfig": { "timeColumnName": "created_at", "timeType": "SECONDS", "replication": "1", "schemaName": "testdata", "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy", "segmentPushFrequency": "HOURLY", "segmentPushType": "APPEND" }, "tableIndexConfig": { "loadMode": "MMAP", "createInvertedIndexDuringSegmentGeneration": true, "invertedIndexColumns": ["fields"], "noDictionaryColumns": ["id"] }, "tenants": {}, "metadata": {} } ``` ### Schema ``` { "schemaName": "testdata", "dimensionFieldSpecs": [ { "name": "fields", "dataType": "STRING", "singleValueField": false }, { "name": "id", "dataType": "STRING" } ], "metricFieldSpecs": [ { "name": "amount", "dataType": "DOUBLE" } ], "dateTimeFieldSpecs": [ { "name": "created_at", "dataType": "LONG", "format": "1:SECONDS:EPOCH", "granularity": "15:MINUTES" } ] } ``` ### Test data (JSON) ``` {"fields": ["foo"], "id": "id_123", "amount": -1.12, "created_at": 1569293987} {"fields": ["foo", "bar"], "id": "id_987", "amount": 1.12, "created_at": 1569293987} {"fields": ["foo"], "id": "id_123", "amount": 1, "created_at": 1569293930} ``` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org