Re: [I] Use @snippet javadoc tag for snippets [lucene]

2025-03-24 Thread via GitHub
dweiss commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2750242048 It's close but there are many differences. See the commit above and try running it on one of the modules, for example suggest (./gradlew -p lucene/suggest spotlessApply). Mayb

Re: [I] Use @snippet javadoc tag for snippets [lucene]

2025-03-24 Thread via GitHub
rmuir commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2749828583 @dweiss what do you think about exploring the option to swap out spotless `java.GoogleJavaFormatStep` with `java.EclipseJdtFormatterStep` but using google-java format config: https:/

Re: [I] IndexReader#leaves method is slightly confusing [lucene]

2025-03-24 Thread via GitHub
jpountz commented on issue #14367: URL: https://github.com/apache/lucene/issues/14367#issuecomment-2748909976 As per discussion on the linked PR, I don't think that this suggestion really makes things better, hence closing as a "won't fix". -- This is an automated message from the Apache

Re: [I] Use @snippet javadoc tag for snippets [lucene]

2025-03-24 Thread via GitHub
rmuir commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2749834480 The Palantir one is no better off from what I can tell, it is a fork of the google one with some cosmetic tweaks. I know eclipse JDT doesn't choke on these constructs, although I don'

Re: [I] TestHnswByteVectorGraph.testBuildingJoinSet reproducible failure [lucene]

2025-03-24 Thread via GitHub
mayya-sharipova closed issue #14396: TestHnswByteVectorGraph.testBuildingJoinSet reproducible failure URL: https://github.com/apache/lucene/issues/14396

Re: [I] HNSW connect components can take an inordinate amount of time [lucene]

2025-03-24 Thread via GitHub
txwei commented on issue #14214: URL: https://github.com/apache/lucene/issues/14214#issuecomment-2749658069 Can we expose a graph construction parameter in `Lucene99HnswVectorsFormat` to gate the `connectComponents()` call? This would allow us to mitigate this issue while a more comprehensi

[PR] Fix TestHnswByteVectorGraph.testBuildingJoinSet [lucene]

2025-03-24 Thread via GitHub
mayya-sharipova opened a new pull request, #14398: URL: https://github.com/apache/lucene/pull/14398 This test fails when the number of documents is low. This change ensures that the number of documents is high enough. Relates to #14331. Closes #14396.

Re: [I] Opening of vector files with ReadAdvice.RANDOM_PRELOAD [lucene]

2025-03-24 Thread via GitHub
msokolov commented on issue #14348: URL: https://github.com/apache/lucene/issues/14348#issuecomment-2749231567 A question I have is how unused fields (at Search time) are handled. I guess their files may have been opened, and thus preloaded? But this could be bad. We often have people creat

Re: [PR] Add leafReaders() Method to IndexReader and Unit Test [lucene]

2025-03-24 Thread via GitHub
jpountz commented on PR #14370: URL: https://github.com/apache/lucene/pull/14370#issuecomment-2748907675 Agreed with @benwtrent and @jainankitk. I'm closing this PR.

Re: [PR] Support load per-iteration replacement of NamedSPI [lucene]

2025-03-24 Thread via GitHub
jpountz commented on PR #14275: URL: https://github.com/apache/lucene/pull/14275#issuecomment-2748960716 I've been thinking a bit more about this. The two potential use-cases that come to mind are the following: - Diverging from the way Lucene does memory management (off-heap / on-heap

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010621926 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -238,9 +296,79 @@ private void scoreWindowUsingBitSet( windowMatches.clear

Re: [PR] Add leafReaders() Method to IndexReader and Unit Test [lucene]

2025-03-24 Thread via GitHub
jpountz closed pull request #14370: Add leafReaders() Method to IndexReader and Unit Test URL: https://github.com/apache/lucene/pull/14370

Re: [I] IndexReader#leaves method is slightly confusing [lucene]

2025-03-24 Thread via GitHub
jpountz closed issue #14367: IndexReader#leaves method is slightly confusing URL: https://github.com/apache/lucene/issues/14367

Re: [PR] Remove nonexistent PackedBlockLength reference in document [lucene]

2025-03-24 Thread via GitHub
jpountz merged PR #14377: URL: https://github.com/apache/lucene/pull/14377

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010628644 ## lucene/core/src/java/org/apache/lucene/search/DocValuesRangeIterator.java: ## @@ -210,6 +210,14 @@ public final boolean matches() throws IOException { }; }

Re: [PR] Implement #docIDRunEnd() on PostingsEnum. [lucene]

2025-03-24 Thread via GitHub
jpountz merged PR #14390: URL: https://github.com/apache/lucene/pull/14390

Re: [PR] Disable sort optimization when tracking all docs [lucene]

2025-03-24 Thread via GitHub
jpountz commented on PR #14395: URL: https://github.com/apache/lucene/pull/14395#issuecomment-2748864218 The change looks correct to me. With recent changes to allow clauses that match all docs to remove themselves from a conjunction, it should be possible to achieve something similar by im

[PR] Disable sort optimization when tracking all docs [lucene]

2025-03-24 Thread via GitHub
bugmakerr opened a new pull request, #14395: URL: https://github.com/apache/lucene/pull/14395 ### Description When `totalHitsThreshold` is greater than or equal to the number of docs, we can disable the sort optimization.
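The condition described in this PR can be sketched as follows. This is only an illustration of the stated rule, not the code from the PR: the class name `SortOptimizationSketch`, the method `canDisableSortOptimization`, and the parameter `numDocs` are hypothetical and do not correspond to any Lucene API.

```java
/** Illustrative sketch only; these names are hypothetical and not part of Lucene. */
public class SortOptimizationSketch {

  /**
   * Returns true when the hit-count threshold covers every document, which is
   * the case the PR description says allows the sort optimization to be disabled.
   *
   * @param totalHitsThreshold number of hits that must be counted accurately
   * @param numDocs number of documents being searched (assumed input)
   */
  public static boolean canDisableSortOptimization(int totalHitsThreshold, int numDocs) {
    return totalHitsThreshold >= numDocs;
  }

  public static void main(String[] args) {
    // Threshold covers all docs: the optimization can be disabled.
    System.out.println(canDisableSortOptimization(1000, 500));
    // Threshold is smaller than the doc count: the optimization stays enabled.
    System.out.println(canDisableSortOptimization(10, 500));
  }
}
```

How the real collector obtains the document count and where the check is applied is not shown in the preview above, so those details are left out here.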

Re: [PR] Make PointValues.intersect iterative instead of recursive [lucene]

2025-03-24 Thread via GitHub
original-brownbear commented on code in PR #14391: URL: https://github.com/apache/lucene/pull/14391#discussion_r2010353392 ## lucene/core/src/java/org/apache/lucene/index/PointValues.java: ## @@ -351,35 +351,32 @@ public final void intersect(IntersectVisitor visitor) throws IOE

Re: [PR] Pack file pointers when merging BKD trees [lucene]

2025-03-24 Thread via GitHub
iverase commented on code in PR #14393: URL: https://github.com/apache/lucene/pull/14393#discussion_r2010066670 ## lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java: ## @@ -596,9 +606,12 @@ private IORunnable writeField1Dim( MutablePointTree reader) th

Re: [PR] Make PointValues.intersect iterative instead of recursive [lucene]

2025-03-24 Thread via GitHub
original-brownbear merged PR #14391: URL: https://github.com/apache/lucene/pull/14391

Re: [PR] Make PointValues.intersect iterative instead of recursive [lucene]

2025-03-24 Thread via GitHub
original-brownbear commented on PR #14391: URL: https://github.com/apache/lucene/pull/14391#issuecomment-2748319202 Thanks Ignacio!

[PR] cache preset dict for LZ4WithPresetDictDecompressor [lucene]

2025-03-24 Thread via GitHub
kkewwei opened a new pull request, #14397: URL: https://github.com/apache/lucene/pull/14397 ### Description As mentioned in #14347, when we use `LZ4WithPresetDictDecompressor` to decompress, we always read the preset dict for every doc in non-merge scenarios. If two consecutive documents fa

[I] TestHnswByteVectorGraph.testBuildingJoinSet reproducible failure [lucene]

2025-03-24 Thread via GitHub
iverase opened a new issue, #14396: URL: https://github.com/apache/lucene/issues/14396 ### Description The test seems to consistently fail for nDoc = 2 and it fails most of the time for other low values of nDoc. ### Gradle command to reproduce ``` ./gradlew t

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010047831 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -238,9 +296,77 @@ private void scoreWindowUsingBitSet( windowMatches.clear

Re: [PR] Fix potential file handle leak in Lucene102BinaryQuantizedVectorsWriter [lucene]

2025-03-24 Thread via GitHub
iverase commented on PR #14394: URL: https://github.com/apache/lucene/pull/14394#issuecomment-2748042773 I am skipping the entry in Changes as it is an unreleased issue.

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
gf2121 commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010115706 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -238,9 +296,79 @@ private void scoreWindowUsingBitSet( windowMatches.clear(

Re: [PR] Fix potential file handle leak in Lucene102BinaryQuantizedVectorsWriter [lucene]

2025-03-24 Thread via GitHub
iverase merged PR #14394: URL: https://github.com/apache/lucene/pull/14394

Re: [I] Support modifying segmentInfos.counter in IndexWriter [lucene]

2025-03-24 Thread via GitHub
guojialiang92 commented on issue #14362: URL: https://github.com/apache/lucene/issues/14362#issuecomment-2747968162 Hi @vigyasharma I will explain the usage scenarios. In the **segment replication** scenario, the file name and content of the replica and the primary shard will be c

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
gf2121 commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010045743 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -238,9 +296,77 @@ private void scoreWindowUsingBitSet( windowMatches.clear(

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010057350 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -238,9 +296,77 @@ private void scoreWindowUsingBitSet( windowMatches.clear

Re: [PR] Pack file pointers when merging BKD trees [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14393: URL: https://github.com/apache/lucene/pull/14393#discussion_r2010004639 ## lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java: ## @@ -596,9 +606,12 @@ private IORunnable writeField1Dim( MutablePointTree reader) th

Re: [PR] Pack file pointers when merging BKD trees [lucene]

2025-03-24 Thread via GitHub
iverase commented on code in PR #14393: URL: https://github.com/apache/lucene/pull/14393#discussion_r2010030440 ## lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java: ## @@ -596,9 +606,12 @@ private IORunnable writeField1Dim( MutablePointTree reader) th

Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14359: URL: https://github.com/apache/lucene/pull/14359#discussion_r2010038886 ## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java: ## @@ -238,9 +296,77 @@ private void scoreWindowUsingBitSet( windowMatches.clear

Re: [PR] Pack file pointers when merging BKD trees [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14393: URL: https://github.com/apache/lucene/pull/14393#discussion_r2010037158 ## lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java: ## @@ -596,9 +606,12 @@ private IORunnable writeField1Dim( MutablePointTree reader) th

Re: [PR] Implement #docIDRunEnd() on PostingsEnum. [lucene]

2025-03-24 Thread via GitHub
gf2121 commented on code in PR #14390: URL: https://github.com/apache/lucene/pull/14390#discussion_r2010019135 ## lucene/core/src/java/org/apache/lucene/codecs/lucene101/Lucene101PostingsReader.java: ## @@ -1059,6 +1059,32 @@ private void bufferIntoBitSet(int start, int end, Fi

Re: [PR] Implement #docIDRunEnd() on PostingsEnum. [lucene]

2025-03-24 Thread via GitHub
jpountz commented on code in PR #14390: URL: https://github.com/apache/lucene/pull/14390#discussion_r2010017533 ## lucene/core/src/java/org/apache/lucene/codecs/lucene101/Lucene101PostingsReader.java: ## @@ -1059,6 +1059,29 @@ private void bufferIntoBitSet(int start, int end, F

Re: [PR] Speed up advancing within a sparse block in IndexedDISI. [lucene]

2025-03-24 Thread via GitHub
gf2121 commented on PR #14371: URL: https://github.com/apache/lucene/pull/14371#issuecomment-2747159521 Thanks for running benchmark @vsop-479 ! > Maybe I should measure it with DVBench in luceneutil, or add a bench in jmh. Yes, you are right, a bench in jmh will be great. We h

[PR] Fix potential file handle leak in Lucene102BinaryQuantizedVectorsWriter [lucene]

2025-03-24 Thread via GitHub
iverase opened a new pull request, #14394: URL: https://github.com/apache/lucene/pull/14394 This test is failing after a recent change in branch 10x: ``` ./gradlew test --tests TestLucene102HnswBinaryQuantizedVectorsFormat.testRandomExceptions -Dtests.seed=22B041A383A5A23E -Dtests.