sgup432 commented on issue #16049: URL: https://github.com/apache/lucene/issues/16049#issuecomment-4436200721
In a nutshell, lucene 9 the worst case time complexity for above case is O(N * logK) where N is the number of matching documents and K is the number of iterator. For lucene 10, worst case can be O(N * K) for a case like above where you cannot exclude docs in batches and need to go one by one. I guess the current approach would still work if the posting lists are not interleaved (e.g., clustered — type_a docs are all in one contiguous block, then type_b docs, etc.), then docIDRunEnd() or the current approach would actually help as we could exclude a lot of docs in batches, and will be faster. But with higher number of sub-iterators(like 11 in this case), the probability of interleaving is higher. So it may not be a bad idea to disable the current batch approach in case we have > 2 sub-iterators like we did above in the benchmark. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
