Re: [PR] Specialize DisiPriorityQueue for the 2-clauses case. [lucene]

2025-01-28 Thread via GitHub
jpountz merged PR #14070: URL: https://github.com/apache/lucene/pull/14070 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Make knn graph conn writing more consistent [lucene]

2025-01-28 Thread via GitHub
benwtrent merged PR #14174: URL: https://github.com/apache/lucene/pull/14174

Re: [I] testMergeStability failing for Knn formats [lucene]

2025-01-28 Thread via GitHub
msokolov commented on issue #13640: URL: https://github.com/apache/lucene/issues/13640#issuecomment-2619823636 Curious if you tried git bisect to see if there was any recent change that reintroduced this?

Re: [I] UnsupportedOperationException instead of IllegalArgumentException from PointInSetQuery when values are out of order [lucene]

2025-01-28 Thread via GitHub
jhinch-at-atlassian-com commented on issue #14161: URL: https://github.com/apache/lucene/issues/14161#issuecomment-2620096721 I should be able to contribute a PR. Just need to finalise details around the CLA and then I will raise a PR

Re: [PR] Upgrade OpenNLP from 2.3.2 to 2.5.3 [lucene]

2025-01-28 Thread via GitHub
mawiesne commented on PR #14130: URL: https://github.com/apache/lucene/pull/14130#issuecomment-2620095679 @msfroh Anything open or preventing a merge of this PR to main/10.x ?

Re: [PR] Upgrade OpenNLP from 2.3.2 to 2.5.3 [lucene]

2025-01-28 Thread via GitHub
msfroh commented on PR #14130: URL: https://github.com/apache/lucene/pull/14130#issuecomment-2620133425 > @msfroh Anything open or preventing a merge of this PR to main/10.x ? I don't think so. It just needs to be merged by a Lucene committer. (I don't have permission.) @dweiss

Re: [I] testMergeStability failing for Knn formats [lucene]

2025-01-28 Thread via GitHub
benwtrent commented on issue #13640: URL: https://github.com/apache/lucene/issues/13640#issuecomment-2620151180 Interesting, the randomized case isn't anything special. It's just a plain ol' Lucene99Hnsw index. No quantization or anything :/

Re: [PR] Upgrade OpenNLP from 2.3.2 to 2.5.3 [lucene]

2025-01-28 Thread via GitHub
msokolov commented on PR #14130: URL: https://github.com/apache/lucene/pull/14130#issuecomment-2620188700 I can merge

Re: [PR] Upgrade OpenNLP from 2.3.2 to 2.5.3 [lucene]

2025-01-28 Thread via GitHub
msokolov merged PR #14130: URL: https://github.com/apache/lucene/pull/14130

Re: [I] Upgrade to OpenNLP 2.5.x [lucene]

2025-01-28 Thread via GitHub
msokolov closed issue #14029: Upgrade to OpenNLP 2.5.x URL: https://github.com/apache/lucene/issues/14029

Re: [PR] Adjust knn merge stability testing [lucene]

2025-01-28 Thread via GitHub
msokolov commented on code in PR #14172: URL: https://github.com/apache/lucene/pull/14172#discussion_r1932955109 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/RandomCodec.java: ## @@ -191,6 +193,9 @@ public PostingsFormat getPostingsFormatForField(String name)

Re: [PR] Adjust knn merge stability testing [lucene]

2025-01-28 Thread via GitHub
benwtrent commented on code in PR #14172: URL: https://github.com/apache/lucene/pull/14172#discussion_r1932966932 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/RandomCodec.java: ## @@ -191,6 +193,9 @@ public PostingsFormat getPostingsFormatForField(String name

Re: [PR] Upgrade opennlp and codec 10x [lucene]

2025-01-28 Thread via GitHub
msokolov merged PR #14177: URL: https://github.com/apache/lucene/pull/14177

[PR] Upgrade opennlp and codec 10x [lucene]

2025-01-28 Thread via GitHub
msfroh opened a new pull request, #14177: URL: https://github.com/apache/lucene/pull/14177 ### Description This change backports https://github.com/apache/lucene/commit/a7b7f0d6583c5532337320efee71d4797f473b60 (Upgrade OpenNLP from 2.3.2 to 2.5.3) and https://github.com/apache/lucen

Re: [PR] Add knn result consistency test [lucene]

2025-01-28 Thread via GitHub
msokolov commented on PR #14167: URL: https://github.com/apache/lucene/pull/14167#issuecomment-2619481888 I was thinking of another approach based on pro-rating. On its own this is deterministic and close to optimally efficient, but risks missing the best results when the index is skewed. I

Re: [PR] Add a HNSW collector that exits early when nearest neighbor queue saturates [lucene]

2025-01-28 Thread via GitHub
tteofili commented on code in PR #14094: URL: https://github.com/apache/lucene/pull/14094#discussion_r1932381957 ## lucene/core/src/java/org/apache/lucene/search/HnswKnnCollector.java: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [I] testMergeStability failing for Knn formats [lucene]

2025-01-28 Thread via GitHub
msokolov commented on issue #13640: URL: https://github.com/apache/lucene/issues/13640#issuecomment-2619990074 I did the git bisect dance and found this test seed starts failing with [Randomize KnnVector codec params in RandomCodec](https://github.com/apache/lucene/commit/6b0112cdee284cae91

Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]

2025-01-28 Thread via GitHub
vigyasharma commented on PR #14173: URL: https://github.com/apache/lucene/pull/14173#issuecomment-2620016142 Ran some early benchmarks to compare this flat storage based multi-vector approach with the existing parent-join approach. I would appreciate any feedback on the approach, benchmark

Re: [PR] [WIP] Multi-Vector support for HNSW search [lucene]

2025-01-28 Thread via GitHub
vigyasharma commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2620021231 I pivoted to an approach that handles independent multi-vectors within flat storage, instead of requiring index time parent-block joins. Have raised a draft PR here – #14173

Re: [PR] Upgrade OpenNLP from 2.3.2 to 2.5.3 [lucene]

2025-01-28 Thread via GitHub
mawiesne commented on PR #14130: URL: https://github.com/apache/lucene/pull/14130#issuecomment-2620855945 > I can merge Thx @msokolov