Re: [PR] Add mergeProgress into MergeState for abort in mergeMiddle [lucene]

2024-10-09 Thread via GitHub
github-actions[bot] commented on PR #13822: URL: https://github.com/apache/lucene/pull/13822#issuecomment-2403660295 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403632181 I cherry-pick'd to 10.x as well. Let's see if CI builds are happy ... if so, I think this is done! -- This is an automated message from the Apache Git Service. To respond to the me

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403616445 OK I pushed 10.0.x fix ... working on 10.x now. Gradle sees the same weird deadlock when I try to `clean` and `check` in one go. So weird.

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403538869 Although, strangely, when I run `./gradlew clean check` I get this very odd (never seen this before) failure: ``` Unable to make progress running work. The following items ar

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403535561 The `git cherry-pick` of the 9.12.x commit merged cleanly onto 10.0.x branch ... and `gradle check` is happy, so I plan to push that cherry-pick to 10.0.x branch (and 10.x after that)

Re: [PR] Make DirectMonotonicReader.Meta more compact [lucene]

2024-10-09 Thread via GitHub
original-brownbear commented on PR #13864: URL: https://github.com/apache/lucene/pull/13864#issuecomment-2403528442 Never mind the above, I think I found a neat solution :)

Re: [I] Lucene 9.12 fails reading older versions of Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-10-09 Thread via GitHub
mikemccand closed issue #13867: Lucene 9.12 fails reading older versions of Lucene99HnswScalarQuantizedVectorsFormat URL: https://github.com/apache/lucene/issues/13867

Re: [I] HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization [lucene]

2024-10-09 Thread via GitHub
mikemccand closed issue #13880: HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization URL: https://github.com/apache/lucene/issues/13880

Re: [I] HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on issue #13880: URL: https://github.com/apache/lucene/issues/13880#issuecomment-2403515707 Fixed by #13874

Re: [I] Lucene 9.12 fails reading older versions of Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on issue #13867: URL: https://github.com/apache/lucene/issues/13867#issuecomment-2403515200 Fixed by #13874

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand merged PR #13874: URL: https://github.com/apache/lucene/pull/13874

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403494337 > I wish we had better automation here (check out release tag, build the bwc index, copy zip files to newer release, etc.). I will open a spinoff issue... Actually, this nice `d

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403490011 OK I'll merge shortly once GitHub action is happy! Thanks everyone, phew!

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403477991 > Oh! Is there an int8_hnsw.9.12.0.zip that can be deleted? D'oh! I'll delete.

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403417121 Oh! Is there an int8_hnsw.9.12.0.zip that can be deleted?

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403377531 > I was not expecting to see changes in `index.9.12.0-cfs.zip` or `index.9.12.0-no.cfs.zip`. Do you know why these are changed? Net/net it's because I regen'd all 9.12.0 bwc zip

Re: [PR] Make DirectMonotonicReader.Meta more compact [lucene]

2024-10-09 Thread via GitHub
original-brownbear commented on PR #13864: URL: https://github.com/apache/lucene/pull/13864#issuecomment-2403376299 > I suspect that the main offenders are doc values terms dictionaries Jup that's it :) > and the storage overhead would likely be negligible without the heap over

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403360433 I was not expecting to see changes in `index.9.12.0-cfs.zip` or `index.9.12.0-no.cfs.zip`. Do you know why these are changed?

Re: [I] IntObjectHashMap.values().toArray() method throws ClassCastException [lucene]

2024-10-09 Thread via GitHub
dweiss commented on issue #13761: URL: https://github.com/apache/lucene/issues/13761#issuecomment-2403307475 Thank you for reporting the problem, @bugmakerr

Re: [I] IntObjectHashMap.values().toArray() method throws ClassCastException [lucene]

2024-10-09 Thread via GitHub
dweiss closed issue #13761: IntObjectHashMap.values().toArray() method throws ClassCastException URL: https://github.com/apache/lucene/issues/13761

Re: [PR] Remove broken .toArray from IntObjectHashMap entirely [lucene]

2024-10-09 Thread via GitHub
dweiss merged PR #13876: URL: https://github.com/apache/lucene/pull/13876

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403280987 OK phew I think I resolved the conflicts due to stale branch, and tests are passing for me. I think this is ready? Let's let GitHub actions chew on it to confirm...

Re: [I] Take advantage of DocValuesSkipper in IndexSortSortedNumericDocValuesRangeQuery [lucene]

2024-10-09 Thread via GitHub
BrianWoolfolk commented on issue #13840: URL: https://github.com/apache/lucene/issues/13840#issuecomment-2403266928 Great! I'll look into this then and make a PR afterwards

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403250756 Woops, I see my local `branch_9_12` (and where I branched my `fix_back_compat` branch from for this PR) was out of date -- `branch_9_12` had already bumped to 9.12.1 and added bwc ind

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2403082537 > > Ack, I'll re-revert the revert. Thanks @ChrisHegarty. > > Thanks. And sorry for the bad suggestion. No worries! > So, I think once the 9.12.0 int7 hnsw bwc file

Re: [I] HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on issue #13880: URL: https://github.com/apache/lucene/issues/13880#issuecomment-2403023334 OK I got this working, and confirmed that the new test case fails on the old (`float32`) BWC indices and passes on the ones in this PR, phew. I pushed that new test case onto th

Re: [PR] Remove broken .toArray from IntObjectHashMap entirely [lucene]

2024-10-09 Thread via GitHub
dweiss commented on code in PR #13876: URL: https://github.com/apache/lucene/pull/13876#discussion_r1793964617 ## lucene/core/src/test/org/apache/lucene/internal/hppc/TestIntObjectHashMap.java: ## @@ -66,10 +69,8 @@ private static void assertSortedListEquals(int[] array, int...

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402848199 > Ack, I'll re-revert the revert. Thanks @ChrisHegarty. Thanks. And sorry for the bad suggestion. So, I think once the 9.12.0 int7 hnsw bwc file is created, then this

Re: [PR] Backport SOLR-14765 to branch_8_11 [lucene-solr]

2024-10-09 Thread via GitHub
itygh commented on PR #2682: URL: https://github.com/apache/lucene-solr/pull/2682#issuecomment-2402841726 This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.

Re: [PR] Backport SOLR-14765 to branch_8_11 [lucene-solr]

2024-10-09 Thread via GitHub
risdenk closed pull request #2682: Backport SOLR-14765 to branch_8_11 URL: https://github.com/apache/lucene-solr/pull/2682

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402825006 > > Hmm the problem is, we need to regen the 9.12.0 `int8_hnsw` back compat index using the bug fix from this PR (so that it actually has `int7` quantized vectors, not unquantized `fl

Re: [I] HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on issue #13880: URL: https://github.com/apache/lucene/issues/13880#issuecomment-2402818214 OK I see -- I am checking the wrong class -- I have to get to the underlying `FlatVectorsReader` ...

Re: [PR] Speedup OrderedIntervalsSource [lucene]

2024-10-09 Thread via GitHub
jpountz commented on PR #13871: URL: https://github.com/apache/lucene/pull/13871#issuecomment-2402747421 Ah, never mind, this PR was not included in the last nightly run, so we should see its effect tomorrow.

Re: [PR] Speedup OrderedIntervalsSource [lucene]

2024-10-09 Thread via GitHub
jpountz commented on PR #13871: URL: https://github.com/apache/lucene/pull/13871#issuecomment-2402745882 The speedup on nightlies is very small: https://benchmarks.mikemccandless.com/IntervalsOrdered.html. I wonder if it's due to a difference in hardware, or queries (nightlies have differen

Re: [PR] Make DirectMonotonicReader.Meta more compact [lucene]

2024-10-09 Thread via GitHub
jpountz commented on PR #13864: URL: https://github.com/apache/lucene/pull/13864#issuecomment-2402721041 Looking at places where we used `DirectMonotonicReader`, I suspect that the main offenders are doc values terms dictionaries, which maintain two `DirectMonotonicReader`s each. However, th

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402700651 > Hmm the problem is, we need to regen the 9.12.0 `int8_hnsw` back compat index using the bug fix from this PR (so that it actually has `int7` quantized vectors, not unquantized `fl

Re: [PR] Fix flakiness issues with TestTieredMergePolicy. [lucene]

2024-10-09 Thread via GitHub
jpountz merged PR #13881: URL: https://github.com/apache/lucene/pull/13881

Re: [I] TestTieredMergePolicy.testSimulateAppendOnly fails with AssertionError [lucene]

2024-10-09 Thread via GitHub
jpountz closed issue #13818: TestTieredMergePolicy.testSimulateAppendOnly fails with AssertionError URL: https://github.com/apache/lucene/issues/13818

Re: [I] Take advantage of DocValuesSkipper in IndexSortSortedNumericDocValuesRangeQuery [lucene]

2024-10-09 Thread via GitHub
gsmiller commented on issue #13840: URL: https://github.com/apache/lucene/issues/13840#issuecomment-2402588570 @iverase I wonder if moving the logic is the right thing to do vs. having the optimization in both places? I suspect the most likely way `IndexSortSortedNumericDocValuesRangeQuery`

Re: [I] HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on issue #13880: URL: https://github.com/apache/lucene/issues/13880#issuecomment-2402543191 Hmm, well, I coded that up, on top of my PR from #13874, with this diff: ``` raptorlake:912x[fix_back_compat]$ git diff diff --git a/lucene/backward-codecs/src/test/o

Re: [I] Take advantage of DocValuesSkipper in IndexSortSortedNumericDocValuesRangeQuery [lucene]

2024-10-09 Thread via GitHub
iverase commented on issue #13840: URL: https://github.com/apache/lucene/issues/13840#issuecomment-2402529934 I am not working on this at the moment. This issue is a spinoff of this comment from Adrien so we might want to pull the optimization to this class completely: https://githu

Re: [I] TestDefaultCodecParallelizesIO.testStoredFields fails [lucene]

2024-10-09 Thread via GitHub
jpountz closed issue #13854: TestDefaultCodecParallelizesIO.testStoredFields fails URL: https://github.com/apache/lucene/issues/13854

Re: [PR] Disable CFS in TestDefaultCodecParallelizesIO. [lucene]

2024-10-09 Thread via GitHub
jpountz merged PR #13875: URL: https://github.com/apache/lucene/pull/13875

[PR] Fix flakiness issues with TestTieredMergePolicy. [lucene]

2024-10-09 Thread via GitHub
jpountz opened a new pull request, #13881: URL: https://github.com/apache/lucene/pull/13881 The two seeds at #13818 had different root causes: - The test allows the number of segments to go above the limit, only if none of the merges are legal. But there are multiple reasons why a merge

Re: [I] Take advantage of DocValuesSkipper in IndexSortSortedNumericDocValuesRangeQuery [lucene]

2024-10-09 Thread via GitHub
gsmiller commented on issue #13840: URL: https://github.com/apache/lucene/issues/13840#issuecomment-2402489157 @BrianWoolfolk this looks like a different optimization from what was done in #13592. It doesn't look like it's worked on to me (at least I don't see this optimization in the code;

Re: [I] HNSW BWC tests should validate that the `int8_hnsw` zip filed index actually uses `int7` quantization [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on issue #13880: URL: https://github.com/apache/lucene/issues/13880#issuecomment-2402478358 > It's a bit tricky, or at least non obvious to me on a quick think, how to introspect the Codec formats used in an index down to the `KnnVectorsFormat`... Actually I think

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-10-09 Thread via GitHub
javanna commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2402464116 Thanks @dweiss !!!

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402452969 > if u merge in `branch_9_12`, then `dev-tools/scripts/addBackcompatIndexes.py 9.12.0` should be able to generate the compat indices. Hmm the problem is, we need to regen the 9

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402425950 > I wonder if the simplest thing here is to just avoid the renaming (from int8 to int7) in 9.x. Otherwise, I don't see how to get `addBackcompatIndexes.py` to work - since 9.12.0 sour

Re: [PR] Remove broken .toArray from IntObjectHashMap entirely [lucene]

2024-10-09 Thread via GitHub
bruno-roustant commented on code in PR #13876: URL: https://github.com/apache/lucene/pull/13876#discussion_r1793571950 ## lucene/core/src/test/org/apache/lucene/internal/hppc/TestIntObjectHashMap.java: ## @@ -66,10 +69,8 @@ private static void assertSortedListEquals(int[] array,

Re: [PR] Add tooling back on 9.10.x branch to generate int7_hnsw.9.10.zip bwc index [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on code in PR #13879: URL: https://github.com/apache/lucene/pull/13879#discussion_r1793484452 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestGenerateBwcIndices.java: ## @@ -82,6 +82,16 @@ public void testCreateSortedIndex() throws

Re: [PR] Add tooling back on 9.10.x branch to generate int7_hnsw.9.10.zip bwc index [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on code in PR #13879: URL: https://github.com/apache/lucene/pull/13879#discussion_r1793486021 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestGenerateBwcIndices.java: ## @@ -82,6 +82,16 @@ public void testCreateSortedIndex() throws

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402235046 I wonder if the simplest thing here is to just avoid the renaming (from int8 to int7) in 9.x. Otherwise, I don't see how to get `addBackcompatIndexes.py` to work - since 9.12.0 sour

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402227085 Argh!! this is very odd. We cannot just update the addBackcompatIndexes.py script. Since it downloads the 9.12.0 version and tries to run gradle on it!! and of course `testCreateIn

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402218226 If you merge in `branch_9_12`, then `dev-tools/scripts/addBackcompatIndexes.py 9.12.0` should be able to generate the compat indices.

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402214908 we need to update `dev-tools/scripts/addBackcompatIndexes.py`. ```diff diff --git a/dev-tools/scripts/addBackcompatIndexes.py b/dev-tools/scripts/addBackcompatIndexes.py

Re: [PR] Reuse Impacts instances across invocations. [lucene]

2024-10-09 Thread via GitHub
jpountz merged PR #13878: URL: https://github.com/apache/lucene/pull/13878

Re: [PR] Reuse Impacts instances across invocations. [lucene]

2024-10-09 Thread via GitHub
jpountz commented on PR #13878: URL: https://github.com/apache/lucene/pull/13878#issuecomment-2402145176 You are right, I pushed a change.

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402140838 > I can open the PR -- I still have the dev area where I got it working and generated the bwc `.zip` in this PR. OK I opened https://github.com/apache/lucene/pull/13879 -- this

[PR] Add tooling back on 9.10.x branch to generate int7_hnsw.9.10.zip bwc index [lucene]

2024-10-09 Thread via GitHub
mikemccand opened a new pull request, #13879: URL: https://github.com/apache/lucene/pull/13879 Spinoff from #13867 to add the tooling back onto `branch_9_10` in case we ever need to regenerate the `int7_hnsw.9.10.0.zip` bwc index again.

Re: [PR] Allow open-ended ranges in Intervals range [lucene]

2024-10-09 Thread via GitHub
mayya-sharipova merged PR #13873: URL: https://github.com/apache/lucene/pull/13873

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402102296 > @mikemccand I can open PRs to the old versions to add the code. I did the same "copy files and run it" dance that you did because I didn't want to commit code to the old branches. B

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402095420 > I'm trying to determine why a bwc test fails in the CI. Thanks @ChrisHegarty. Can you point me to the CI build that is angry? I assume you set up a 9.12.x CI build somewh

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
benwtrent commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402094834 @mikemccand I can open PRs to the old versions to add the code. I did the same "copy files and run it" dance that you did because I didn't want to commit code to the old branches. But

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
mikemccand commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402071452 > Thank you for fixing my bwc mess! It was definitely a learning experience and apparently I didn't learn enough! No worries @benwtrent -- it is crazy complicated and trappy! I

Re: [PR] Remove vector values copy() methods, moving IndexInput.clone() and temp storage into lower-level interfaces [lucene]

2024-10-09 Thread via GitHub
benwtrent commented on PR #13872: URL: https://github.com/apache/lucene/pull/13872#issuecomment-2402064518 > hm there is some functional problem with the change that yields terrible recall for quantized vectors. I'll dig and fix and see if I can beef up the unit test coverage as well.

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
benwtrent commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402052859 Ah, I see 9.12.0 is already assumed by the bwc indices! I guess that means we need to generate 9.12.0 indices for the hnsw bwc. You are correct @ChrisHegarty.

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
benwtrent commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2402047972 I think you have to add it as a backcompat index in another file, but we don't do that until 9.12.1 is cut and released! Since 9.12.1 hasn't gone through the release process, according

Re: [I] Lucene 9.12 fails reading older versions of Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-10-09 Thread via GitHub
benwtrent commented on issue #13867: URL: https://github.com/apache/lucene/issues/13867#issuecomment-2402043612 Ack, I should have added more back compat indices MORE TESTING

Re: [PR] Avoid performance regression by constructing lazily the PointTree in NumericComparator (#13498) [lucene]

2024-10-09 Thread via GitHub
iverase merged PR #13877: URL: https://github.com/apache/lucene/pull/13877

Re: [PR] Avoid performance regression by constructing lazily the PointTree in NumericComparator (#13498) [lucene]

2024-10-09 Thread via GitHub
iverase commented on PR #13877: URL: https://github.com/apache/lucene/pull/13877#issuecomment-2401917665 I added an entry in 77b8098

Re: [PR] Avoid performance regression by constructing lazily the PointTree in NumericComparator (#13498) [lucene]

2024-10-09 Thread via GitHub
iverase commented on PR #13877: URL: https://github.com/apache/lucene/pull/13877#issuecomment-2401883251 I don't think we need to add an entry in changes as this is part of 9.11.1 and 9.12.0?

Re: [I] DataInput class can't be used with delegation pattern [lucene]

2024-10-09 Thread via GitHub
dweiss closed issue #13820: DataInput class can't be used with delegation pattern URL: https://github.com/apache/lucene/issues/13820

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-10-09 Thread via GitHub
dweiss commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2401859755 I've added migration and changes entries and cherry-picked to branch_10_0, branch_10x and main.

Re: [PR] Avoid performance regression by constructing lazily the PointTree in NumericComparator (#13498) [lucene]

2024-10-09 Thread via GitHub
javanna commented on PR #13877: URL: https://github.com/apache/lucene/pull/13877#issuecomment-2401843256 Perhaps a changes entry would be good to add?

Re: [PR] PR 13757 follow-up: add missing with-discountOverlaps Similarity constructor variants, CHANGES.txt entries (#13845) [lucene]

2024-10-09 Thread via GitHub
javanna commented on PR #13858: URL: https://github.com/apache/lucene/pull/13858#issuecomment-2401838440 Heads up: I moved the CHANGES entry in main and branch_10x to the 10.0.0 section. See https://github.com/apache/lucene/commit/111fc6f07857f909ad2a253cdcd6e948222ecf63 .

Re: [PR] PR 13757 follow-up: add missing with-discountOverlaps Similarity constructor variants, CHANGES.txt entries (#13845) [lucene]

2024-10-09 Thread via GitHub
javanna commented on PR #13858: URL: https://github.com/apache/lucene/pull/13858#issuecomment-2401818106 I merged this to branch_10_0. Does this also need to be backported to 9_12 for the next 9.12.1 patch release?

Re: [PR] PR 13757 follow-up: add missing with-discountOverlaps Similarity constructor variants, CHANGES.txt entries (#13845) [lucene]

2024-10-09 Thread via GitHub
javanna merged PR #13858: URL: https://github.com/apache/lucene/pull/13858

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-10-09 Thread via GitHub
javanna commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2401799763 I think we are also missing a CHANGES entry. @dweiss, could you take care of adding that please?

Re: [PR] Fix 9.12.0 backcompat break (Lucene 9.12.0 cannot read 9.11.x indices written with quantized HNSW, `Lucene99HnswScalarQuantizedVectorsFormat`) [lucene]

2024-10-09 Thread via GitHub
ChrisHegarty commented on PR #13874: URL: https://github.com/apache/lucene/pull/13874#issuecomment-2401798286 I think that this is a good change. I'm trying to determine why a bwc test fails in CI.

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-10-09 Thread via GitHub
javanna commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2401796608 I'm confused that this PR was not merged, yet it has a milestone set. Should the milestone be set on the other PR that has been merged instead (#13830)?

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-10-09 Thread via GitHub
dweiss commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2401573279 I've backported this to branch_10x and asked the RM to allow this to be cherry picked for the subsequent respin of Lucene 10.0.

[PR] Remove broken .toArray from IntObjectHashMap entirely [lucene]

2024-10-09 Thread via GitHub
dweiss opened a new pull request, #13876: URL: https://github.com/apache/lucene/pull/13876 This method is only used in tests. I moved the toList collection utility to the test class to minimize surface API. Fixes #13761

Re: [PR] Reduce allocations in ByteBuffersDataOutput.writeString [lucene]

2024-10-09 Thread via GitHub
original-brownbear merged PR #13863: URL: https://github.com/apache/lucene/pull/13863

Re: [PR] Reduce allocations in ByteBuffersDataOutput.writeString [lucene]

2024-10-09 Thread via GitHub
original-brownbear commented on PR #13863: URL: https://github.com/apache/lucene/pull/13863#issuecomment-2401511333 Thanks Adrien!