Re: [I] Why doesn't RRF handle tied scores with equal ranking instead of using positional ranking? [lucene]

2025-06-11 Thread via GitHub
jpountz commented on issue #14769: URL: https://github.com/apache/lucene/issues/14769#issuecomment-2964001810 I'm happy to see this API being used as it was only added in the last minor release. The change that you suggested makes sense to me. I'd like to then use doc IDs as a tie-break

Re: [PR] Fix IndexSortSortedNumericDocValuesRangeQuery for int sort (#14732) [lucene]

2025-06-11 Thread via GitHub
dsmiley commented on PR #14736: URL: https://github.com/apache/lucene/pull/14736#issuecomment-2964446601 The CHANGES.txt addition should go under 9.12.2, not 9.12.0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [I] Why doesn't RRF handle tied scores with equal ranking instead of using positional ranking? [lucene]

2025-06-11 Thread via GitHub
benwtrent commented on issue #14769: URL: https://github.com/apache/lucene/issues/14769#issuecomment-2962676829 > However, when all documents have identical scores (such as search results on keyword fields), this approach could lead to incorrect RRF scores. I don't think this makes th

Re: [I] Why doesn't RRF handle tied scores with equal ranking instead of using positional ranking? [lucene]

2025-06-11 Thread via GitHub
hellosunil commented on issue #14769: URL: https://github.com/apache/lucene/issues/14769#issuecomment-2963050158 I agree that using internal doc ID as a tie-breaker for sorting documents with identical scores within a single query result is reasonable. However, I'm concerned about a specifi
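
To make the question in this thread concrete, here is a minimal, self-contained sketch of the two ranking schemes being compared: positional ranking (rank = position in the list) versus tied ranking (equal scores share a rank). It is an illustration only, not Lucene's implementation. The class `RrfTieSketch`, its method names, and the constant `k = 60` are all hypothetical choices made for this example, and the standard RRF contribution `1 / (k + rank)` is assumed.

```java
import java.util.Arrays;

/** Hypothetical illustration of rank assignment for RRF; not part of Lucene. */
class RrfTieSketch {

  /** Positional ranking: the hit at index i gets rank i + 1, even if scores tie. */
  static int[] positionalRanks(double[] sortedScores) {
    int[] ranks = new int[sortedScores.length];
    for (int i = 0; i < ranks.length; i++) {
      ranks[i] = i + 1;
    }
    return ranks;
  }

  /** Tied ranking: hits with equal scores share the rank of the first hit in the tie. */
  static int[] tiedRanks(double[] sortedScores) {
    int[] ranks = new int[sortedScores.length];
    for (int i = 0; i < ranks.length; i++) {
      boolean tie = i > 0 && sortedScores[i] == sortedScores[i - 1];
      ranks[i] = tie ? ranks[i - 1] : i + 1;
    }
    return ranks;
  }

  /** Contribution of a single result list to a hit's RRF score: 1 / (k + rank). */
  static double rrfContribution(int k, int rank) {
    return 1.0 / (k + rank);
  }

  public static void main(String[] args) {
    // Scores sorted in descending order; all equal, as with a keyword-field match.
    double[] scores = {1.0, 1.0, 1.0};
    int k = 60; // assumed constant for this sketch

    System.out.println("positional: " + Arrays.toString(positionalRanks(scores)));
    System.out.println("tied:       " + Arrays.toString(tiedRanks(scores)));

    // Under positional ranking the last hit contributes less than the first,
    // although both have the same score; under tied ranking they contribute equally.
    System.out.println(rrfContribution(k, positionalRanks(scores)[2])
        < rrfContribution(k, positionalRanks(scores)[0]));
    System.out.println(rrfContribution(k, tiedRanks(scores)[2])
        == rrfContribution(k, tiedRanks(scores)[0]));
  }
}
```

With positional ranking, whatever decides the order among tied hits (for example the doc ID tie-break mentioned in the thread) also changes each hit's RRF contribution. With tied ranking, the order among equal scores has no effect on the contribution.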

Re: [PR] Fix map size in CustomAnalyzer#paramsToMap. [lucene]

2025-06-11 Thread via GitHub
github-actions[bot] commented on PR #14770: URL: https://github.com/apache/lucene/pull/14770#issuecomment-2962564613 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

[PR] Fix map size in CustomAnalyzer#paramsToMap. [lucene]

2025-06-11 Thread via GitHub
vsop-479 opened a new pull request, #14770: URL: https://github.com/apache/lucene/pull/14770 ### Description

Re: [PR] Improve hnsw on heap ram est [lucene]

2025-06-11 Thread via GitHub
ChrisHegarty commented on code in PR #14765: URL: https://github.com/apache/lucene/pull/14765#discussion_r2140477487 ## lucene/core/src/test/org/apache/lucene/util/hnsw/HnswGraphTestCase.java: ## @@ -786,6 +786,7 @@ public void testHnswGraphBuilderInvalid() throws IOException {

Re: [PR] Improve hnsw on heap ram est [lucene]

2025-06-11 Thread via GitHub
benwtrent commented on code in PR #14765: URL: https://github.com/apache/lucene/pull/14765#discussion_r2140506765 ## lucene/core/src/test/org/apache/lucene/util/hnsw/HnswGraphTestCase.java: ## @@ -786,6 +786,7 @@ public void testHnswGraphBuilderInvalid() throws IOException {

Re: [PR] [BlockJoin] Add ParentsChildrenBlockJoinQuery to support parent and c… [lucene]

2025-06-11 Thread via GitHub
mkhludnev commented on PR #14728: URL: https://github.com/apache/lucene/pull/14728#issuecomment-2962256207 @msfroh I still think that extending existing `ToChildBlockJoinQuery` with this functionality is preferable, at least from the code maintenance standpoint. However, I can't contribute

Re: [PR] Improve hnsw on heap ram est [lucene]

2025-06-11 Thread via GitHub
benwtrent merged PR #14765: URL: https://github.com/apache/lucene/pull/14765

Re: [I] Why doesn't RRF handle tied scores with equal ranking instead of using positional ranking? [lucene]

2025-06-11 Thread via GitHub
hellosunil commented on issue #14769: URL: https://github.com/apache/lucene/issues/14769#issuecomment-2965086174 You're absolutely right about the complexity with field-sorted results. After thinking about it more, I agree with your second suggestion - we should document that this RRF helpe