Re: [PR] Fix TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults [lucene]

2024-05-15 Thread via GitHub
timgrein commented on PR #13361: URL: https://github.com/apache/lucene/pull/13361#issuecomment-2111804179 > eventually we are going to stop searching in the graph altogether and just brute force, which ruins the reason for the test Makes sense, decreased `k` again to `60` 👍 -- Thi

Re: [PR] Disjunction as CompetitiveIterator for numeric dynamic pruning [lucene]

2024-05-15 Thread via GitHub
gf2121 commented on code in PR #13221: URL: https://github.com/apache/lucene/pull/13221#discussion_r1601216525 ## lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java: ## @@ -405,5 +395,278 @@ public int advance(int target) throws IOException { p

Re: [PR] Disjunction as CompetitiveIterator for numeric dynamic pruning [lucene]

2024-05-15 Thread via GitHub
gf2121 commented on code in PR #13221: URL: https://github.com/apache/lucene/pull/13221#discussion_r1601216937 ## lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java: ## @@ -207,102 +208,91 @@ private void updateCompetitiveIterator() throws IOExcept

Re: [PR] Disjunction as CompetitiveIterator for numeric dynamic pruning [lucene]

2024-05-15 Thread via GitHub
gf2121 commented on code in PR #13221: URL: https://github.com/apache/lucene/pull/13221#discussion_r1601218108 ## lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java: ## @@ -405,5 +395,278 @@ public int advance(int target) throws IOException { p

Re: [PR] Make `IndexInput#prefetch` take an offset. [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty commented on PR #13363: URL: https://github.com/apache/lucene/pull/13363#issuecomment-2111961244 ++ this absolute addressing is much nicer. We could even add `prefetch` to the RandomAccessInput interface, though I'm not sure how much that is used. -- This is an automated mess

[PR] Add a double addressing vector scorer [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty opened a new pull request, #13370: URL: https://github.com/apache/lucene/pull/13370 This commit adds a method to RandomVectorScorerSupplier that allows scoring two vectors based on their ordinals. The existing model of this API first creates a scorer, that effectively bind
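
For readers skimming the thread, here is a minimal sketch of the two call styles the description contrasts: create a scorer for one ordinal and then score other nodes against it, or score two ordinals directly. Every name in it (`PairScoringSketch`, `BoundScorer`, `dot`, the in-memory `float[][]`) is hypothetical and chosen for illustration; it is not Lucene's API and not the code in the PR.

```java
public class PairScoringSketch {
    // Hypothetical stand-in for a scorer created for one fixed vector.
    interface BoundScorer {
        float score(int node);
    }

    // Plain dot product, used here only as an example similarity.
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Style 1: first create a scorer for `ord`, then score other nodes against it.
    static BoundScorer scorer(float[][] vectors, int ord) {
        float[] query = vectors[ord];
        return node -> dot(query, vectors[node]);
    }

    // Style 2: score two vectors directly by their ordinals, with no scorer object.
    static float score(float[][] vectors, int firstOrd, int secondOrd) {
        return dot(vectors[firstOrd], vectors[secondOrd]);
    }

    public static void main(String[] args) {
        float[][] vectors = {{1f, 2f}, {3f, 4f}, {5f, 6f}};
        System.out.println(scorer(vectors, 0).score(1));
        System.out.println(score(vectors, 0, 1));
    }
}
```

Both calls in `main` compute the same similarity; the difference is only whether an intermediate scorer object is created first.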

Re: [PR] Add a double addressing vector scorer [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty commented on code in PR #13370: URL: https://github.com/apache/lucene/pull/13370#discussion_r1601372789 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorScorer.java: ## @@ -291,6 +291,11 @@ public RandomVectorScorer scorer(int o

[I] Reproducible failure org.apache.lucene.search.TestBlockMaxConjunction [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty opened a new issue, #13371: URL: https://github.com/apache/lucene/issues/13371 ``` ./gradlew test --tests TestBlockMaxConjunction.testRandom -Dtests.seed=C9FE523C4E733438 -Dtests.locale=ln -Dtests.timezone=Asia/Damascus -Dtests.asserts=true -Dtests.file.encoding=UTF-8

Re: [PR] Add a double addressing vector scorer [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty commented on PR #13370: URL: https://github.com/apache/lucene/pull/13370#issuecomment-2112455983 > One thing the Scorer object gave us is caching of the single vector that is used many times. > > The underlying Offheap vector objects cache the vector on heap and prevents

[I] Reproducible failure TestOrdinalMap.testRamBytesUsed [lucene]

2024-05-15 Thread via GitHub
easyice opened a new issue, #13372: URL: https://github.com/apache/lucene/issues/13372 ### Description bisect shows 55ca9f76a06ba025a98cb297db52e21537c55d14 is the first bad commit ``` > java.lang.AssertionError: expected:<976> but was:<960> > at __rando

Re: [I] NRT failure due to FieldInfo & File mismatch [lucene]

2024-05-15 Thread via GitHub
benwtrent closed issue #13353: NRT failure due to FieldInfo & File mismatch URL: https://github.com/apache/lucene/issues/13353 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

Re: [PR] Fix weird NRT bug #13353 [lucene]

2024-05-15 Thread via GitHub
benwtrent merged PR #13369: URL: https://github.com/apache/lucene/pull/13369

Re: [PR] Use `IndexInput#prefetch` for terms dictionary lookups. [lucene]

2024-05-15 Thread via GitHub
jpountz commented on PR #13359: URL: https://github.com/apache/lucene/pull/13359#issuecomment-2112625165 I iterated a bit on this change: - `TermsEnum#prepareSeekExact` is introduced, which only prefetches data which is later going to be needed by `TermsEnum#seekExact`. - `TermStates

[I] DataOutput.writeGroupVInts throws IntegerOverflow exception during merging [lucene]

2024-05-15 Thread via GitHub
iamsanjay opened a new issue, #13373: URL: https://github.com/apache/lucene/issues/13373 ### Description As discussed on the email list, `DataOutput.writeGroupVInts` throws an IntegerOverflow exception. The goal is to find out the main reason and also to improve the exception

Re: [I] DataOutput.writeGroupVInts throws IntegerOverflow exception during merging [lucene]

2024-05-15 Thread via GitHub
iamsanjay commented on issue #13373: URL: https://github.com/apache/lucene/issues/13373#issuecomment-2112843369 java.base/java.lang.Math.toIntExact(Math.java:1135) at org.apache.lucene.store.DataOutput.writeGroupVInts(DataOutput.java:354) at https://github.com/apache/lucene/blob/f
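
The top frame of the trace above is `Math.toIntExact`, which throws `ArithmeticException` when a `long` does not fit in an `int`. The sketch below shows that JDK behaviour in isolation, plus one possible way to make the message name the offending value. The class `ToIntExactDemo` and the helper `narrow` are hypothetical; this is not the code in `DataOutput` and not a proposed fix.

```java
public class ToIntExactDemo {
    // Hypothetical helper: narrows a long to an int and, on overflow,
    // rethrows with a message that includes the value that did not fit.
    static int narrow(long value) {
        try {
            return Math.toIntExact(value);
        } catch (ArithmeticException e) {
            throw new ArithmeticException("value does not fit in an int: " + value);
        }
    }

    public static void main(String[] args) {
        // A value inside the int range passes through unchanged.
        System.out.println(narrow(42L));

        // One past Integer.MAX_VALUE triggers the overflow path.
        try {
            narrow(Integer.MAX_VALUE + 1L);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage());
        }
    }
}
```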

Re: [PR] Add a double addressing vector scorer [lucene]

2024-05-15 Thread via GitHub
jimczi commented on code in PR #13370: URL: https://github.com/apache/lucene/pull/13370#discussion_r1602010896 ## lucene/core/src/java/org/apache/lucene/codecs/hnsw/ScalarQuantizedVectorScorer.java: ## @@ -165,6 +169,15 @@ public float score(int node) throws IOException {

Re: [PR] Add a double addressing vector scorer [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty commented on code in PR #13370: URL: https://github.com/apache/lucene/pull/13370#discussion_r1602135463 ## lucene/core/src/java/org/apache/lucene/codecs/hnsw/DefaultFlatVectorScorer.java: ## @@ -109,6 +109,12 @@ public float score(int node) throws IOException {

[PR] Fix bug in SQ when just a single vector present [lucene]

2024-05-15 Thread via GitHub
ChrisHegarty opened a new pull request, #13374: URL: https://github.com/apache/lucene/pull/13374 This commit fixes a corner case in the ScalarQuantizer when just a single vector is present. I ran into this when updating a test that previously passed successfully with Lucene 9.10 but fails i

Re: [PR] GITHUB-12892: Deprecate FacetsCollector#search helper methods as they internally use IndexSearcher#search(Query, Collector) APIs [lucene]

2024-05-15 Thread via GitHub
github-actions[bot] commented on PR #13334: URL: https://github.com/apache/lucene/pull/13334#issuecomment-2113686335 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [I] DataOutput.writeGroupVInts throws IntegerOverflow exception during merging [lucene]

2024-05-15 Thread via GitHub
easyice commented on issue #13373: URL: https://github.com/apache/lucene/issues/13373#issuecomment-2113980214 Sorry for missing the email list. From just reading the code, it seems the `docDeltaBuffer` should not overflow. I will try to reproduce this issue. Could you show me your source cod

Re: [I] DataOutput.writeGroupVInts throws IntegerOverflow exception during merging [lucene]

2024-05-15 Thread via GitHub
JervenBolleman commented on issue #13373: URL: https://github.com/apache/lucene/issues/13373#issuecomment-2114206903 Hi @easyice, I am the original reporter on the mailing list. As the code around indexing is a bit abstracted it might be hard to follow. What I do have, is the index t