Re: [PR] Remove SingleValueDocValuesFieldUpdates abstract class (only one implementation) [lucene]

2025-01-14 Thread via GitHub
iverase merged PR #14059: URL: https://github.com/apache/lucene/pull/14059 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Implement ACORN-1 search for HNSW [lucene]

2025-01-14 Thread via GitHub
benwtrent commented on PR #14085: URL: https://github.com/apache/lucene/pull/14085#issuecomment-2591446496 Thanks @benchaplin Those constants and numbers are focused on expanding and contracting the graph search as we hit various NSW with more or fewer matching docs. One dicta

Re: [PR] Use read advice consistently in the knn vector formats [lucene]

2025-01-14 Thread via GitHub
github-actions[bot] commented on PR #14076: URL: https://github.com/apache/lucene/pull/14076#issuecomment-2591374368 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Implement ACORN-1 search for HNSW [lucene]

2025-01-14 Thread via GitHub
benchaplin commented on PR #14085: URL: https://github.com/apache/lucene/pull/14085#issuecomment-2591282248 Awesome stuff @benwtrent - thanks for spearheading the luceneutil recall fix, still trying to wrap my head around how I followed so many "patterns" in those numbers during initial dev

Re: [PR] Implement ACORN-1 search for HNSW [lucene]

2025-01-14 Thread via GitHub
benwtrent commented on PR #14085: URL: https://github.com/apache/lucene/pull/14085#issuecomment-2591197302 I updated my branch further. Got some interesting results which indicate that our graph exploration is slightly too expensive (vint reading and graph seek end up dominating the cost),

[PR] Publish build scans to develocity.apache.org [lucene]

2025-01-14 Thread via GitHub
clayburn opened a new pull request, #14140: URL: https://github.com/apache/lucene/pull/14140 This PR migrates the Lucene project to publish Build Scans to the new Develocity instance at develocity.apache.org.

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
jpountz merged PR #14133: URL: https://github.com/apache/lucene/pull/14133

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
msokolov commented on PR #14133: URL: https://github.com/apache/lucene/pull/14133#issuecomment-2590513414 Looks like one of the checks failed with " > org.apache.lucene.index.CheckIndex$CheckIndexException: Field "vector" has repeated neighbors of node 2424 with value 2450" -- unrelat

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
msokolov commented on code in PR #14133: URL: https://github.com/apache/lucene/pull/14133#discussion_r1915219571 ## lucene/core/src/java/org/apache/lucene/codecs/lucene101/Lucene101PostingsReader.java: ## @@ -572,7 +597,36 @@ public int freq() throws IOException { }

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
jpountz commented on code in PR #14133: URL: https://github.com/apache/lucene/pull/14133#discussion_r1915083479 ## lucene/core/src/java/org/apache/lucene/codecs/lucene101/Lucene101PostingsReader.java: ## @@ -572,7 +597,36 @@ public int freq() throws IOException { } p

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
jpountz commented on code in PR #14133: URL: https://github.com/apache/lucene/pull/14133#discussion_r1915082635 ## lucene/core/src/java/org/apache/lucene/codecs/lucene101/Lucene101PostingsWriter.java: ## @@ -405,7 +422,34 @@ private void flushDocBlock(boolean finishTerm) throws

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
msokolov commented on code in PR #14133: URL: https://github.com/apache/lucene/pull/14133#discussion_r1914773748 ## lucene/core/src/java/org/apache/lucene/codecs/lucene101/Lucene101PostingsReader.java: ## @@ -572,7 +597,36 @@ public int freq() throws IOException { }

Re: [PR] Integrating GPU based Vector Search using cuVS [lucene]

2025-01-14 Thread via GitHub
benwtrent commented on code in PR #14131: URL: https://github.com/apache/lucene/pull/14131#discussion_r1914748083 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/vectorsearch/CuVSVectorsWriter.java: ## @@ -0,0 +1,402 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
jpountz commented on PR #14133: URL: https://github.com/apache/lucene/pull/14133#issuecomment-2589769229 I merged the removal of the `acceptDocs` parameter to `intoBitSet` so this is now ready for review.

Re: [PR] Remove `acceptDocs` argument from `DocIdSetIterator#intoBitSet` and introduce `Bits#applyMask`. [lucene]

2025-01-14 Thread via GitHub
jpountz merged PR #14134: URL: https://github.com/apache/lucene/pull/14134

[I] Exception raised when using FixedShingleFilter with WordDelimiterGraphFilter [lucene]

2025-01-14 Thread via GitHub
binshengliu opened a new issue, #14137: URL: https://github.com/apache/lucene/issues/14137 ### Description Hi, I'd like to report an issue using `FixedShingleFilter` with `WordDelimiterGraphFilter`. An exception is raised on the following conditions. * Tokenizer produces 1 token

Re: [PR] Encode dense blocks of postings as bit sets. [lucene]

2025-01-14 Thread via GitHub
jpountz commented on PR #14133: URL: https://github.com/apache/lucene/pull/14133#issuecomment-2589415510 I also ran the benchmark from https://tantivy-search.github.io/bench/ to see if it gives similar feedback. For reference `global` queries means "conjunctions and disjunctions" in this be