Re: [I] Lucene99FlatVectorsReader.getFloatVectorValues(): NPE: Cannot read field "vectorEncoding" because "fieldEntry" is null [lucene]

2024-08-07 Thread via GitHub
david-sitsky commented on issue #13626: URL: https://github.com/apache/lucene/issues/13626#issuecomment-2274740734 @benwtrent - is there a change we can make to stop the NPE? At the moment it prevents our product from completing the index operation and is a bit of a blocker. It wou

Re: [PR] Take advantage of the doc value skipper when it is primary sort [lucene]

2024-08-07 Thread via GitHub
github-actions[bot] commented on PR #13592: URL: https://github.com/apache/lucene/pull/13592#issuecomment-2274597503 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Slightly speed up decoding blocks of postings/freqs/positions. [lucene]

2024-08-07 Thread via GitHub
gsmiller commented on code in PR #13631: URL: https://github.com/apache/lucene/pull/13631#discussion_r1708050384 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/ForDeltaUtil.java: ## @@ -41,10 +54,275 @@ private static void prefixSumOfOnes(long[] arr, long base) {

Re: [PR] Delegating the matches in PointRangeQuery weight to relate method [lucene]

2024-08-07 Thread via GitHub
jainankitk commented on PR #13599: URL: https://github.com/apache/lucene/pull/13599#issuecomment-2274473157 > I think it's a little clunky for the reader at the same time (it's a bit strange to have to pass the same packed value point twice Thanks @gsmiller for providing this feedback

Re: [PR] Compute facets while collecting [lucene]

2024-08-07 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1707996827 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/cutters/ranges/IntervalTracker.java: ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Compute facets while collecting [lucene]

2024-08-07 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1707976141 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/cutters/ranges/LongRangeFacetCutter.java: ## @@ -0,0 +1,413 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Compute facets while collecting [lucene]

2024-08-07 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1707964637 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinals/package-info.java: ## @@ -0,0 +1,18 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Compute facets while collecting [lucene]

2024-08-07 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1707953948 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/FacetRecorder.java: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [I] DocumentsWriterDeleteQueue.getNextSequenceNumber assertion failure seqNo=9 vs maxSeqNo=8 [lucene]

2024-08-07 Thread via GitHub
benwtrent closed issue #13571: DocumentsWriterDeleteQueue.getNextSequenceNumber assertion failure seqNo=9 vs maxSeqNo=8 URL: https://github.com/apache/lucene/issues/13571 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [I] TestIDVersionPostingsFormat failure [lucene]

2024-08-07 Thread via GitHub
benwtrent closed issue #13127: TestIDVersionPostingsFormat failure URL: https://github.com/apache/lucene/issues/13127

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
benwtrent merged PR #13627: URL: https://github.com/apache/lucene/pull/13627

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
rmuir commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707498886 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
uschindler commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707463821 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorizationProvider.java: ## @@ -79,4 +102,18 @@ public VectorUtilSupport getVectorUtilSu

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
gsmiller commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707451406 ## lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultPostingDecodingUtil.java: ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707450580 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,44 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
uschindler commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707448409 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
uschindler commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707445042 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707422464 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -491,4 +580,44 @@ public int hashCode() { classHash(), contextIdentit

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707421233 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,44 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
rmuir commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707377294 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
rmuir commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707373761 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707372290 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -70,6 +72,43 @@ public static void search( search(scorer, knnCollector, graph

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707371348 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -70,6 +72,43 @@ public static void search( search(scorer, knnCollector, graph

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
rmuir commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707370205 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
jpountz commented on code in PR #13627: URL: https://github.com/apache/lucene/pull/13627#discussion_r1707359678 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java: ## @@ -430,10 +430,16 @@ long updateDocuments( } flushingDWPT = flushControl.doAfte

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
jpountz commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707341860 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707338145 ## lucene/core/src/java/org/apache/lucene/search/KnnByteVectorQuery.java: ## @@ -72,14 +72,30 @@ public KnnByteVectorQuery(String field, byte[] target, int k) {

[PR] Knn(Float-->Byte)VectorField javadocs update in KnnByteVectorQuery [lucene]

2024-08-07 Thread via GitHub
cpoerschke opened a new pull request, #13637: URL: https://github.com/apache/lucene/pull/13637 (no comment)

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
jpountz commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707332904 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-07 Thread via GitHub
jpountz commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1707325377 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software F

Re: [I] Add nightly test that calculates recall for vector similarity spaces [lucene]

2024-08-07 Thread via GitHub
tteofili commented on issue #13616: URL: https://github.com/apache/lucene/issues/13616#issuecomment-2273758763 Also related to this is supporting it in luceneutil: https://github.com/mikemccand/luceneutil/issues/278

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
benwtrent commented on PR #13627: URL: https://github.com/apache/lucene/pull/13627#issuecomment-2273681251 OK, I have confirmed that @s1monw's proposed fix does work: ``` DocumentsWriterPerThread obtainAndLock() { while (closed == false) { final DocumentsWriterPer

Re: [PR] CandidateMatcher public matching functions [lucene]

2024-08-07 Thread via GitHub
romseygeek commented on PR #13632: URL: https://github.com/apache/lucene/pull/13632#issuecomment-2273613663 Thanks @bjacobowitz!

Re: [PR] CandidateMatcher public matching functions [lucene]

2024-08-07 Thread via GitHub
romseygeek merged PR #13632: URL: https://github.com/apache/lucene/pull/13632

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707073496 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java: ## @@ -82,9 +83,16 @@ protected KnnVectorsReader() {} * @param knnCollector a KnnResults

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-07 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1707074017 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java: ## @@ -110,9 +118,16 @@ public abstract void search( * @param knnCollector a KnnResults

Re: [PR] Add float|byte vector support to memory index [lucene]

2024-08-07 Thread via GitHub
benwtrent merged PR #13633: URL: https://github.com/apache/lucene/pull/13633

Re: [I] Add support for reading/writing dense vectors to MemoryIndex [lucene]

2024-08-07 Thread via GitHub
benwtrent closed issue #13584: Add support for reading/writing dense vectors to MemoryIndex URL: https://github.com/apache/lucene/issues/13584

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
benwtrent commented on PR #13627: URL: https://github.com/apache/lucene/pull/13627#issuecomment-2273283552 @s1monw I think `deleteQueue.isAdvanced()` is pretty much already this "stale" flag. It indicates that we have solidified how many tasks should be finished and thus we should never reu

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
benwtrent commented on code in PR #13627: URL: https://github.com/apache/lucene/pull/13627#discussion_r1706831238 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java: ## @@ -430,10 +430,16 @@ long updateDocuments( } flushingDWPT = flushControl.doAf

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
benwtrent commented on code in PR #13627: URL: https://github.com/apache/lucene/pull/13627#discussion_r1706826818 ## lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java: ## @@ -430,10 +430,16 @@ long updateDocuments( } flushingDWPT = flushControl.doAf

Re: [PR] Fix race condition on flush for DWPT seqNo generation [lucene]

2024-08-07 Thread via GitHub
s1monw commented on PR #13627: URL: https://github.com/apache/lucene/pull/13627#issuecomment-2272816300 It might be that my memory is blurry, but let me suggest a different way of doing this. When we mark for a full flush in DWPTFlushControl we do lock the DWPTThreadPool for new writers. Onc