xtrntr opened a new issue #8089: URL: https://github.com/apache/pinot/issues/8089
## Description

`userid` is an INT column with a cardinality of >8M, with values from 1 to 8,XXX,XXX.

TABLE has a total of 3,192,209,696 rows.

TABLE has 70 segments distributed over 4 servers running Pinot on Kubernetes, each with 10 cores and 35GB memory (16GB xms/xmx). Disks are SSD.

## Queries to replicate issue

`SELECT DISTINCTCOUNT(userid) FROM $TABLE WHERE userid BETWEEN 7000001 AND 8000000`
returns 1,000,000, as expected.

`SELECT userid, count(*) FROM $TABLE WHERE userid BETWEEN 7000001 AND 8000000 GROUP BY userid LIMIT 10000000`
yields 883,574 results.

`SELECT userid, count(*) FROM $TABLE WHERE userid BETWEEN 7000001 AND 8000000 GROUP BY userid LIMIT 10000000 OPTION(minServerGroupTrimSize=-1)`
yields 883,574 results.

## Configurations

```
# Explicitly set
pinot.server.query.executor.num.groups.limit=10000000 # 10 million

# Not set (using default)
pinot.broker.enable.query.limit.override
pinot.broker.query.response.limit
```

### `index_map`

```
userid.dictionary.startOffset = 0
userid.dictionary.size = 2678308
userid.forward_index.startOffset = 2678308
userid.forward_index.size = 5356608
```

### `metadata.properties`

```
column.userid.cardinality = 669575
column.userid.totalDocs = 48582356
column.userid.dataType = INT
column.userid.bitsPerElement = 20
column.userid.lengthOfEachEntry = 0
column.userid.columnType = DIMENSION
column.userid.isSorted = true
column.userid.hasNullValue = false
column.userid.hasDictionary = true
column.userid.textIndexType = NONE
column.userid.hasInvertedIndex = true
column.userid.hasFSTIndex = false
column.userid.hasJsonIndex = false
column.userid.isSingleValues = true
column.userid.maxNumberOfMultiValues = 0
column.userid.totalNumberOfEntries = 48582356
column.userid.isAutoGenerated = false
column.userid.minValue = 7296465
column.userid.maxValue = 8119811
column.userid.defaultNullValue = -2147483648
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org