saurabhd336 commented on code in PR #10254: URL: https://github.com/apache/pinot/pull/10254#discussion_r1106747320
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/BaseImmutableDictionary.java:
##########
@@ -279,4 +281,39 @@ protected byte[] getBytes(int dictId) {
   protected byte[] getBuffer() {
     return new byte[_numBytesPerValue];
   }
+
+  public void getDictIds(List<String> values, IntSet dictIds, int inPredicateSparseThreshold,
+      int inPredicateSortThreshold) {
+    if (length() / values.size() > inPredicateSparseThreshold || values.size() < inPredicateSortThreshold) {
+      for (String value : values) {

Review Comment:
   Probably a micro-optimization, but here if `values.size() > inPredicateSparseThreshold`, we know `values` is sorted. So if we could expose the binary search function to accept a lower bound, we could replace this loop with something like
   ```
   int lastDictId = 0;
   for (String value : values) {
     // Search only from the last matched dictId onwards
     int dictId = indexOf(lastDictId, value);
     if (dictId >= 0) {
       dictIds.add(dictId);
       lastDictId = dictId;
     }
   }
   ```
   i.e., since `values` is sorted, a value can only match at or after the previous match, so the search space for each value can be reduced further.

   The binary search can accept a new param to specify the start index:
   ```
   protected int binarySearch(int lowIdx, long value) {
     int low = lowIdx;
     int high = _length - 1;
     while (low <= high) {
       int mid = (low + high) >>> 1;
       long midValue = _valueReader.getLong(mid);
       if (midValue < value) {
         low = mid + 1;
       } else if (midValue > value) {
         high = mid - 1;
       } else {
         return mid;
       }
     }
     return -(low + 1);
   }
   ```
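
   For illustration, here is a minimal self-contained sketch of the same idea over a plain sorted `long[]`. It does not use Pinot's dictionary API: the class `LowerBoundLookupSketch` and both method signatures are hypothetical. It also assumes the values are sorted ascending, and it reuses the insertion point on a miss to advance the lower bound, which goes slightly beyond the suggestion above.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical stand-in for a sorted long dictionary; not Pinot's actual class.
   final class LowerBoundLookupSketch {
     private LowerBoundLookupSketch() {
     }

     // Binary search over sortedDict[lowIdx .. length - 1].
     // Returns the index of value, or -(insertionPoint + 1) when it is absent.
     static int binarySearch(long[] sortedDict, int lowIdx, long value) {
       int low = lowIdx;
       int high = sortedDict.length - 1;
       while (low <= high) {
         int mid = (low + high) >>> 1;
         long midValue = sortedDict[mid];
         if (midValue < value) {
           low = mid + 1;
         } else if (midValue > value) {
           high = mid - 1;
         } else {
           return mid;
         }
       }
       return -(low + 1);
     }

     // Collects the dictionary ids of sortedValues (assumed sorted ascending).
     static List<Integer> getDictIds(long[] sortedDict, long[] sortedValues) {
       List<Integer> dictIds = new ArrayList<>();
       int lowIdx = 0;
       for (long value : sortedValues) {
         int result = binarySearch(sortedDict, lowIdx, value);
         if (result >= 0) {
           dictIds.add(result);
           lowIdx = result;
         } else {
           // Assumption beyond the review comment: every entry before the
           // insertion point is smaller than value, so later values cannot match there.
           lowIdx = -(result + 1);
         }
         if (lowIdx >= sortedDict.length) {
           break; // all remaining values are larger than every dictionary entry
         }
       }
       return dictIds;
     }
   }
   ```

   Each lookup starts where the previous one ended, so the searched range shrinks as the loop moves through `sortedValues`. Whether this is measurable on real segments would need a benchmark.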