Re: [PR] Improve vector search speed by using FixedBitSet [lucene]

2024-01-05 Thread via GitHub
benwtrent closed pull request #12789: Improve vector search speed by using FixedBitSet URL: https://github.com/apache/lucene/pull/12789

Re: [PR] Improve vector search speed by using FixedBitSet [lucene]

2023-11-15 Thread via GitHub
jpountz commented on PR #12789: URL: https://github.com/apache/lucene/pull/12789#issuecomment-1813030726 ++ This feels similar to `IndexOrDocValuesQuery`: we probably can't guess the absolute best threshold, but we can probably figure out something that is right more often than wrong. Hopef

Re: [PR] Improve vector search speed by using FixedBitSet [lucene]

2023-11-10 Thread via GitHub
benwtrent commented on PR #12789: URL: https://github.com/apache/lucene/pull/12789#issuecomment-1805943735 @jpountz searching scales logarithmically, but we do have to explore more if there are any pre-filtered nodes. We can run some experiments to determine the appropriate threshold.
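
For illustration only, the kind of threshold being discussed could be sketched as below. This is not code from the PR: the class name, method names, and the 10% cut-off are hypothetical placeholders, and the real threshold would have to come from the experiments mentioned above. The sketch only shows the shape of the decision (dense bit set when a large share of nodes is expected to be visited, sparse otherwise) and the fixed cost of a dense bit set at one bit per node.

```java
// Hypothetical sketch, not Lucene code: choose between a dense and a sparse
// "visited nodes" set from the expected fraction of graph nodes visited.
class VisitedSetChoice {

    // ASSUMPTION: placeholder cut-off; the thread does not settle on a value.
    static final double DENSE_THRESHOLD = 0.1;

    // Returns true when a dense (fixed-size) bit set looks like the better fit.
    static boolean useDense(int graphSize, int expectedVisited) {
        if (graphSize <= 0) {
            return true; // nothing to track, so the choice does not matter
        }
        return (double) expectedVisited / graphSize >= DENSE_THRESHOLD;
    }

    // Bytes used by the backing long[] of a dense bit set:
    // one bit per node, rounded up to whole 64-bit words (array header ignored).
    static long denseBytes(int graphSize) {
        return ((graphSize + 63L) / 64L) * Long.BYTES;
    }
}
```

With these assumptions, a graph of 1,000,000 nodes costs 125,000 bytes in the dense case regardless of how many nodes are visited, which is the baseline any sparse structure has to beat.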

Re: [PR] Improve vector search speed by using FixedBitSet [lucene]

2023-11-10 Thread via GitHub
jpountz commented on PR #12789: URL: https://github.com/apache/lucene/pull/12789#issuecomment-1805727513 Thanks, the numbers make more sense to me now. Intuitively, `FixedBitSet` performs better when a large percentage of nodes needs to be visited and `SparseFixedBitSet` performs bett

Re: [PR] Improve vector search speed by using FixedBitSet [lucene]

2023-11-09 Thread via GitHub
benwtrent commented on PR #12789: URL: https://github.com/apache/lucene/pull/12789#issuecomment-1804203048 @jpountz I re-ran my tests and double-checked my numbers, and I have some corrections: I accidentally double-counted sparse sizes, so the previous numbers are 2x too big. GLOVE-100-100_

Re: [PR] Improve vector search speed by using FixedBitSet [lucene]

2023-11-09 Thread via GitHub
jpountz commented on PR #12789: URL: https://github.com/apache/lucene/pull/12789#issuecomment-1804146598 I can believe that `FixedBitSet` is faster in some cases, but it's surprising to me that the memory usage of `SparseFixedBitSet` can go up to 2x that of `FixedBitSet`. This makes me wonder if