Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-09 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2709371309 Thanks for the feedback! > Since you wrote this, I expected tip files to become bigger, but your data suggests the opposite, tip files are getting smaller? Am I reading it correctly?

Re: [PR] Allow reading binary doc values as a RandomAccessInput [lucene]

2025-03-09 Thread via GitHub
github-actions[bot] commented on PR #13948: URL: https://github.com/apache/lucene/pull/13948#issuecomment-2709158794 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Binary vector format for flat and hnsw vectors [lucene]

2025-03-09 Thread via GitHub
lpld commented on PR #14078: URL: https://github.com/apache/lucene/pull/14078#issuecomment-2709087834 @benwtrent A short question again. Is this quantization approach applicable in principle when my data is constantly changing, i.e. new vectors are being added and old vectors removed from

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-09 Thread via GitHub
jpountz commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2709064558 The speedup on PKLookup is exciting! > IMO for tip, performance is more important than storage size, which is usually a very small part of the whole index, and loaded off-heap.

Re: [PR] Make Lucene better at skipping long runs of matches. [lucene]

2025-03-09 Thread via GitHub
jpountz commented on code in PR #14312: URL: https://github.com/apache/lucene/pull/14312#discussion_r1986310517 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -128,6 +128,16 @@ private void scoreWindowUsingBitSet( assert windowMatche

Re: [PR] Make Lucene better at skipping long runs of matches. [lucene]

2025-03-09 Thread via GitHub
jpountz commented on PR #14312: URL: https://github.com/apache/lucene/pull/14312#issuecomment-2708834023 I've been thinking a bit more about naming, since I don't like `peekNextNonMatchingDocID` much. I'm thinking of renaming it to `docIDRunEnd` (using "run" as in "run-length encoding"). I like i

Re: [PR] Make Lucene better at skipping long runs of matches. [lucene]

2025-03-09 Thread via GitHub
jpountz commented on code in PR #14312: URL: https://github.com/apache/lucene/pull/14312#discussion_r1986310180 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -128,6 +128,16 @@ private void scoreWindowUsingBitSet( assert windowMatche