[GitHub] [lucene] mikemccand opened a new issue, #12487: Can/should `KnnByte/FloatVectorQuery` carry some human-meaningful opaque `toString` fragment?

2023-08-03 Thread via GitHub
mikemccand opened a new issue, #12487: URL: https://github.com/apache/lucene/issues/12487 ### Description Over in https://github.com/mikemccand/luceneutil/issues/226 while trying to fix a sneaky and long-standing Lucene nightly benchmark non-determinism that affected `VectorSearch` a

[GitHub] [lucene] turingmachine commented on pull request #12485: Fix onlyLongestMatch in DictionaryCompoundWordTokenFilter

2023-08-03 Thread via GitHub
turingmachine commented on PR #12485: URL: https://github.com/apache/lucene/pull/12485#issuecomment-1663953429 +1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubsc

[GitHub] [lucene] turingmachine commented on pull request #12478: Add Option to Set Subtoken Position Increment for Dictonary Decompounder

2023-08-03 Thread via GitHub
turingmachine commented on PR #12478: URL: https://github.com/apache/lucene/pull/12478#issuecomment-1663954051 +1

[GitHub] [lucene] Frostfire25 commented on issue #12463: Learned sorting algorithm for Lucene

2023-08-03 Thread via GitHub
Frostfire25 commented on issue #12463: URL: https://github.com/apache/lucene/issues/12463#issuecomment-1664368419 Hey, very interested in assisting with the implementation of this algorithm.

[GitHub] [lucene] Jackyrie2 commented on pull request #12480: Enhancement 11236 lazy compute similarity score

2023-08-03 Thread via GitHub
Jackyrie2 commented on PR #12480: URL: https://github.com/apache/lucene/pull/12480#issuecomment-1664597207 Here is a quick re-run of the benchmark (100-dim vectors) on the optimized code with a 90% - 10% split on document addition: Baseline -> old candidate -> optimized candidate: 253