Re: [PR] Move synonym map off-heap for SynonymGraphFilter [lucene]

2025-02-22 Thread via GitHub
github-actions[bot] commented on PR #13054: URL: https://github.com/apache/lucene/pull/13054#issuecomment-2676466824 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Introduce DocIdStream#intoBitset to speed up cache [lucene]

2025-02-22 Thread via GitHub
jpountz commented on PR #14277: URL: https://github.com/apache/lucene/pull/14277#issuecomment-2676680926 I believe that this could help `FacetCollector` as well. Intuitively, I had thought of `BulkScorer` as a better place for this API that would more easily help the queries that you

Re: [PR] Support load per-iteration replacement of NamedSPI [lucene]

2025-02-22 Thread via GitHub
ChrisHegarty commented on PR #14275: URL: https://github.com/apache/lucene/pull/14275#issuecomment-2676407372 > Is it that this switches from "first one wins" to "last one wins"? Good q. Out of the box, the default remains unchanged - first one wins. But if overridden, allows subseq
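
The two policies named in this exchange can be sketched as follows. This is only an illustration of "first one wins" versus "last one wins" when several providers register under the same name; it is not the code from the PR. The function `build_registry`, the `replace_existing` flag, and the provider names are all hypothetical.

```python
def build_registry(providers, replace_existing=False):
    """Map provider names to implementations.

    providers: iterable of (name, implementation) pairs in load order.
    replace_existing: hypothetical switch; False keeps the first
    implementation seen for a name, True lets later ones replace it.
    """
    registry = {}
    for name, impl in providers:
        if name not in registry or replace_existing:
            registry[name] = impl
    return registry


# Two hypothetical providers registered under the same name.
providers = [("codecA", "core"), ("codecA", "plugin")]

first_wins = build_registry(providers)
last_wins = build_registry(providers, replace_existing=True)
```

With the default, `first_wins["codecA"]` is `"core"`; with the flag set, `last_wins["codecA"]` is `"plugin"`.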

Re: [PR] Support load per-iteration replacement of NamedSPI [lucene]

2025-02-22 Thread via GitHub
msokolov commented on PR #14275: URL: https://github.com/apache/lucene/pull/14275#issuecomment-2676371365 Is it that this switches from "first one wins" to "last one wins"?

[PR] Enhance DictionaryCompoundWordTokenFilter [lucene]

2025-02-22 Thread via GitHub
renatoh opened a new pull request, #14278: URL: https://github.com/apache/lucene/pull/14278 Adding option to consume characters if a matching word is found, and not used for further potential matches anymore. E.g. if the word "schwein" is extracted, the sub-word "wein" is not extracted any
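
The behaviour described here can be sketched roughly as below. This is a simplified, assumed model of dictionary decompounding, not the filter's actual implementation: the function `decompose`, its `consume` and `min_len` parameters, and the sample word are hypothetical, and the sketch keeps only the longest dictionary match at each start position.

```python
def decompose(token, dictionary, min_len=2, consume=False):
    """Extract dictionary sub-words from a compound token.

    At each start position the longest dictionary match is kept.
    consume=False: advance one character, so overlapping sub-words
    are still found. consume=True: skip past the matched characters,
    so they cannot take part in further matches.
    """
    parts = []
    i = 0
    while i < len(token):
        longest = None
        # Try candidate end positions from longest to shortest.
        for j in range(len(token), i + min_len - 1, -1):
            if token[i:j] in dictionary:
                longest = token[i:j]
                break
        if longest is None:
            i += 1
            continue
        parts.append(longest)
        i += len(longest) if consume else 1
    return parts


words = {"schwein", "wein", "fleisch"}
overlapping = decompose("schweinefleisch", words)
consumed = decompose("schweinefleisch", words, consume=True)
```

Here `overlapping` is `["schwein", "wein", "fleisch"]`, while `consumed` is `["schwein", "fleisch"]`: once "schwein" is matched, its characters are no longer available for "wein".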

[PR] Introduce DocIdStream#intoBitset to speed up cache [lucene]

2025-02-22 Thread via GitHub
gf2121 opened a new pull request, #14277: URL: https://github.com/apache/lucene/pull/14277 I was looking for a way to use `DocIdSetIterator#intoBitset` to speed up caching docs into `FixedBitset`. But it seems challenging to take advantage of both specialized BulkScorers and `Scorer#iterato

Re: [PR] OptimisticKnnVectorQuery [lucene]

2025-02-22 Thread via GitHub
dungba88 commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2676175583 I ran some benchmarks with the Cohere 768 dataset for 3 different algorithms: (1) the baseline "greedy", (2) this PR's "optimistic", and (3) "pro-rata" only. (2) and (3) will converge wit