Re: [PR] SortedSet DV Multi Range query [lucene]

2025-02-01 Thread via GitHub
mkhludnev commented on code in PR #13974: URL: https://github.com/apache/lucene/pull/13974#discussion_r1938278091 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/search/SortedSetMultiRangeQuery.java: ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [I] Allow skip_factor to be set dynamically within QueryCache [lucene]

2025-02-01 Thread via GitHub
jpountz commented on issue #14183: URL: https://github.com/apache/lucene/issues/14183#issuecomment-2629028239 In general I'm not a fan of exposing tuning knobs just because we can expose them. Deciding when a clause is worth caching feels like something that Lucene is the right decision mak

Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]

2025-02-01 Thread via GitHub
vigyasharma commented on PR #14173: URL: https://github.com/apache/lucene/pull/14173#issuecomment-2629167044 > I think this PR is still doing globally unique ordinals for vectors? So, vectors 1, 2, 3 go to document 1 and ordinals 4, 5 go to doc 2? If so, I think we should "bite the bullet"

Re: [PR] Allow `LogMergePolicy` to merge more than `mergeFactor` segments together when the merge is below the min merge size. [lucene]

2025-02-01 Thread via GitHub
jpountz merged PR #14166: URL: https://github.com/apache/lucene/pull/14166 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[PR] Bump floor segment size to 16MB. [lucene]

2025-02-01 Thread via GitHub
jpountz opened a new pull request, #14189: URL: https://github.com/apache/lucene/pull/14189 This bumps the floor segment size from 2MB (`TieredMergePolicy`) / 1.6MB (`LogByteSizeMergePolicy`) to 16MB in Lucene 11. My motivation is that such small segment sizes don't make index structu

Re: [PR] Remove `maxMergeAtOnce` option from `TieredMergePolicy`. [lucene]

2025-02-01 Thread via GitHub
jpountz merged PR #14165: URL: https://github.com/apache/lucene/pull/14165

Re: [PR] Fix refill logic in nextDoc(). [lucene]

2025-02-01 Thread via GitHub
jpountz merged PR #14185: URL: https://github.com/apache/lucene/pull/14185

Re: [I] Allow skip_factor to be set dynamically within QueryCache [lucene]

2025-02-01 Thread via GitHub
sgup432 commented on issue #14183: URL: https://github.com/apache/lucene/issues/14183#issuecomment-2629157954 We're considering adjusting the skip_factor limits dynamically to optimize query cache usage when needed, especially when it's underutilized. By exposing this as a dynamic cluster s

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-02-01 Thread via GitHub
kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2628899940 I found one way to reduce index-time RAM usage -- turns out the [`FlatVectorsWriter`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/hnsw/FlatVe