Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-02 Thread via GitHub
iamsanjay commented on code in PR #13319: URL: https://github.com/apache/lucene/pull/13319#discussion_r1587404169 ## lucene/core/src/java/org/apache/lucene/search/FilterWeight.java: ## @@ -58,11 +58,6 @@ public Explanation explain(LeafReaderContext context, int doc) throws IOEx

Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-02 Thread via GitHub
iamsanjay commented on PR #13319: URL: https://github.com/apache/lucene/pull/13319#issuecomment-2090140897 Instead of both get() and cost() throwing the exception, can we make scorerSupplier throw `UnsupportedOperationException()`? ``` @Override public ScorerSupplier scorerSupplie

Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-02 Thread via GitHub
iamsanjay commented on code in PR #13319: URL: https://github.com/apache/lucene/pull/13319#discussion_r1587434132 ## lucene/queries/src/java/org/apache/lucene/queries/spans/SpanWeight.java: ## @@ -135,16 +135,6 @@ private Similarity.SimScorer buildSimWeight( public abstract S

Re: [I] TestHnswBitVectorsFormat.testIndexAndSearchBitVectors fails intermittently [lucene]

2024-05-02 Thread via GitHub
benwtrent closed issue #13326: TestHnswBitVectorsFormat.testIndexAndSearchBitVectors fails intermittently URL: https://github.com/apache/lucene/issues/13326 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

Re: [PR] Fix TestHnswBitVectorsFormat.testIndexAndSearchBitVectors flakiness [lucene]

2024-05-02 Thread via GitHub
benwtrent merged PR #1: URL: https://github.com/apache/lucene/pull/1

Re: [PR] Add test case to ensure scalar quantization adheres to known ranges [lucene]

2024-05-02 Thread via GitHub
ChrisHegarty commented on PR #13336: URL: https://github.com/apache/lucene/pull/13336#issuecomment-2090397935 Our scorer implementations should be able to take advantage of the ranges being in: * int7: 0 - 127 (inclusive) * int4: 0 - 15 (inclusive)
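To illustrate the ranges mentioned in the comment above, here is a minimal, self-contained Java sketch of a range check for quantized vector components. It is not Lucene's quantizer: the class `QuantizedRangeSketch` and its methods `maxValue`, `quantize`, and `inRange` are hypothetical names introduced only for this example, and the linear min/max mapping is an assumed, simplified scheme (it also assumes `min < max`).

```java
// Hypothetical sketch, not Lucene code: shows why int7 values fall in
// [0, 127] and int4 values in [0, 15] under a simple linear quantization.
final class QuantizedRangeSketch {

  /** Largest quantized value for a bit width: 127 for 7 bits, 15 for 4 bits. */
  static int maxValue(int bits) {
    return (1 << bits) - 1;
  }

  /**
   * Linearly maps value from [min, max] onto [0, maxValue(bits)].
   * Inputs outside [min, max] are clamped first. Assumes min < max.
   */
  static byte quantize(float value, float min, float max, int bits) {
    int top = maxValue(bits);
    float clamped = Math.max(min, Math.min(max, value));
    float scaled = (clamped - min) / (max - min) * top;
    return (byte) Math.round(scaled);
  }

  /** Returns true if every component lies in [0, maxValue(bits)]. */
  static boolean inRange(byte[] quantized, int bits) {
    int top = maxValue(bits);
    for (byte b : quantized) {
      if (b < 0 || b > top) {
        return false;
      }
    }
    return true;
  }
}
```

A test in the spirit of the PR title could quantize a batch of floats with `bits = 7` or `bits = 4` and then check `inRange` on the result. Because every value is non-negative and fits in a signed byte, a scorer could treat the components as small unsigned integers without any sign handling.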

Re: [PR] Add test case to ensure scalar quantization adheres to known ranges [lucene]

2024-05-02 Thread via GitHub
benwtrent merged PR #13336: URL: https://github.com/apache/lucene/pull/13336

Re: [PR] Add new VectorScorer interface to vector value iterators [lucene]

2024-05-02 Thread via GitHub
msokolov commented on code in PR #13181: URL: https://github.com/apache/lucene/pull/13181#discussion_r1587804525 ## lucene/core/src/java/org/apache/lucene/util/quantization/QuantizedByteVectorValues.java: ## @@ -18,13 +18,40 @@ import java.io.IOException; import org.apache.l

Re: [I] Decouple within-query concurrency from the index's segment geometry [LUCENE-8675] [lucene]

2024-05-02 Thread via GitHub
msokolov commented on issue #9721: URL: https://github.com/apache/lucene/issues/9721#issuecomment-2090843987 One thing came up during my testing / messing around that I think could significantly affect the API we provide which is whether we want to bake in the algorithm for computing leaves

[PR] Add IndexInput#prefetch. [lucene]

2024-05-02 Thread via GitHub
jpountz opened a new pull request, #13337: URL: https://github.com/apache/lucene/pull/13337 This adds `IndexInput#prefetch`, which is an optional operation that instructs the `IndexInput` to start fetching bytes from storage in the background. These bytes will be picked up by follow-up call

Re: [I] Decouple within-query concurrency from the index's segment geometry [LUCENE-8675] [lucene]

2024-05-02 Thread via GitHub
stefanvodita commented on issue #9721: URL: https://github.com/apache/lucene/issues/9721#issuecomment-2090878680 > we may want to modify the amount of concurrency we apply to each Query in response to operational conditions I would really like it if we could do this. It could be a ver

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-02 Thread via GitHub
jpountz commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2090882979 I created the following benchmark to simulate lookups in a terms dictionary that cannot fit in the page cache. ```java import java.io.IOException; import java.nio

Re: [PR] Make segment/field attribute updates thread-safe. [lucene]

2024-05-02 Thread via GitHub
jpountz merged PR #13331: URL: https://github.com/apache/lucene/pull/13331

Re: [PR] Add new VectorScorer interface to vector value iterators [lucene]

2024-05-02 Thread via GitHub
benwtrent commented on code in PR #13181: URL: https://github.com/apache/lucene/pull/13181#discussion_r1587945011 ## lucene/core/src/java/org/apache/lucene/search/VectorScorer.java: ## @@ -18,64 +18,39 @@ import java.io.IOException; import org.apache.lucene.index.ByteVectorV

Re: [PR] Remove unused "implements Accountable". [lucene]

2024-05-02 Thread via GitHub
jpountz commented on code in PR #13330: URL: https://github.com/apache/lucene/pull/13330#discussion_r1587950557 ## lucene/codecs/src/java/org/apache/lucene/codecs/blockterms/VariableGapTermsIndexReader.java: ## @@ -168,22 +165,6 @@ public FieldIndexEnum getFieldEnum(FieldInfo fi

Re: [PR] Remove unused "implements Accountable". [lucene]

2024-05-02 Thread via GitHub
jpountz commented on code in PR #13330: URL: https://github.com/apache/lucene/pull/13330#discussion_r1587951466 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java: ## @@ -121,9 +121,6 @@ void initIndexInput() { public Stats computeBlockS

Re: [PR] Add new VectorScorer interface to vector value iterators [lucene]

2024-05-02 Thread via GitHub
benwtrent commented on code in PR #13181: URL: https://github.com/apache/lucene/pull/13181#discussion_r1587957013 ## lucene/core/src/java/org/apache/lucene/util/quantization/QuantizedByteVectorValues.java: ## @@ -18,13 +18,40 @@ import java.io.IOException; import org.apache.

Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-02 Thread via GitHub
jpountz commented on PR #13319: URL: https://github.com/apache/lucene/pull/13319#issuecomment-2091034684 > Instead of both get() and cost() throwing the exception, can we make scorerSupplier throw UnsupportedOperationException()? In general yes, there may be a few exceptions though like

Re: [I] Decouple within-query concurrency from the index's segment geometry [LUCENE-8675] [lucene]

2024-05-02 Thread via GitHub
jpountz commented on issue #9721: URL: https://github.com/apache/lucene/issues/9721#issuecomment-2091045497 > I guess one alternative is to maintain multiple IndexSearchers with different characteristics Since IndexSearcher is very cheap to create, you could create a new `IndexSearch

[I] Doc out of order issue from lucene [lucene]

2024-05-02 Thread via GitHub
SaiSatwik opened a new issue, #13338: URL: https://github.com/apache/lucene/issues/13338 ### Description We are seeing docs out of order error multiple times on Opensearch 1.2.3. Seems issue is coming from lucene, but no clue what could be happening under the hood. No much significan

Re: [I] Doc out of order issue from Lucene 8.10.1 [lucene]

2024-05-02 Thread via GitHub
mkhludnev commented on issue #13338: URL: https://github.com/apache/lucene/issues/13338#issuecomment-2091183402 Hi @SaiSatwik Are you running some sort of test?

Re: [I] Decouple within-query concurrency from the index's segment geometry [LUCENE-8675] [lucene]

2024-05-02 Thread via GitHub
msokolov commented on issue #9721: URL: https://github.com/apache/lucene/issues/9721#issuecomment-2091192189 I don't know, our IndexSearcher looks a little heavy; I think some of that is our own doing and we could tease it apart, but isn't e.g. the query cache tied to the IndexSearcher? And we

Re: [I] Doc out of order issue from Lucene 8.10.1 [lucene]

2024-05-02 Thread via GitHub
SaiSatwik commented on issue #13338: URL: https://github.com/apache/lucene/issues/13338#issuecomment-2091193420 Hi @mkhludnev No. We have seen this issue in a VM where single node opensearch deployment is running. Due to this issue opensearch index went into RED state, leading to fai

Re: [I] Doc out of order issue from Lucene 8.10.1 [lucene]

2024-05-02 Thread via GitHub
mkhludnev commented on issue #13338: URL: https://github.com/apache/lucene/issues/13338#issuecomment-2091200687 Pardon. I had a wrong clue about a change in test framework occurring later. Now, looking into the version you mention, I realized, it's wrong. Have no idea. Don't you have index

Re: [I] Doc out of order issue from Lucene 8.10.1 [lucene]

2024-05-02 Thread via GitHub
SaiSatwik commented on issue #13338: URL: https://github.com/apache/lucene/issues/13338#issuecomment-2091255775 @mkhludnev, we do not have index sorting configured. But could you please help me understand how this issue could be related to index sorting configuration?

Re: [I] Doc out of order issue from Lucene 8.10.1 [lucene]

2024-05-02 Thread via GitHub
mkhludnev commented on issue #13338: URL: https://github.com/apache/lucene/issues/13338#issuecomment-2091305382 Honestly, I have no idea. How many docs do you have in this index? Could it exceed 2 billion?

Re: [PR] Add a MemorySegment Vector scorer - for scoring without copying on-heap [lucene]

2024-05-02 Thread via GitHub
ChrisHegarty commented on code in PR #13339: URL: https://github.com/apache/lucene/pull/13339#discussion_r1588276295 ## lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorizationProvider.java: ## @@ -198,6 +201,11 @@ private static void ensureCaller() { priva

Re: [PR] Add a MemorySegment Vector scorer - for scoring without copying on-heap [lucene]

2024-05-02 Thread via GitHub
ChrisHegarty commented on code in PR #13339: URL: https://github.com/apache/lucene/pull/13339#discussion_r1588276752 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentByteVectorScorerSupplier.java: ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache

Re: [PR] Add a MemorySegment Vector scorer - for scoring without copying on-heap [lucene]

2024-05-02 Thread via GitHub
ChrisHegarty commented on code in PR #13339: URL: https://github.com/apache/lucene/pull/13339#discussion_r1588277534 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentByteVectorScorerSupplier.java: ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache

[I] IndexWriter loses track of parent field when documents fail to use it [lucene]

2024-05-02 Thread via GitHub
msokolov opened a new issue, #13340: URL: https://github.com/apache/lucene/issues/13340 ### Description This test fails with ` java.lang.IllegalArgumentException: can't add a parent field to an already existing index without a parent field`. If you index any documents in the index, u

Re: [I] IndexWriter loses track of parent field when index is empty [lucene]

2024-05-02 Thread via GitHub
msokolov commented on issue #13340: URL: https://github.com/apache/lucene/issues/13340#issuecomment-2091741574 I hope someone who is familiar with this will quickly see the problem - I'm not sure if we (1) maybe fail to write any FieldInfos when the index is empty, but now we must, or (2) f

Re: [I] IndexWriter loses track of parent field when index is empty [lucene]

2024-05-02 Thread via GitHub
msokolov commented on issue #13340: URL: https://github.com/apache/lucene/issues/13340#issuecomment-2091762332 I tried removing ``` if (leaves.isEmpty()) {

Re: [PR] Convert more classes to record classes [lucene]

2024-05-02 Thread via GitHub
uschindler commented on PR #13328: URL: https://github.com/apache/lucene/pull/13328#issuecomment-2091805740 Thanks, looks fine. I have no time to do another closer review, please give me some time to proceed.

Re: [I] IndexWriter loses track of parent field when index is empty [lucene]

2024-05-02 Thread via GitHub
msokolov commented on issue #13340: URL: https://github.com/apache/lucene/issues/13340#issuecomment-2091822754 OK, now I see that field infos is part of the segment so we would not have written it. I guess in this case we should explicitly recognize and work around the empty case.

Re: [PR] Convert more classes to record classes [lucene]

2024-05-02 Thread via GitHub
uschindler commented on code in PR #13328: URL: https://github.com/apache/lucene/pull/13328#discussion_r1588461460 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/GeneratingSuggester.java: ## @@ -447,14 +446,8 @@ private static int commonCharacterPositionS

Re: [PR] gh-13340: Allow adding a parent field to an index with no fields [lucene]

2024-05-02 Thread via GitHub
msokolov commented on PR #13341: URL: https://github.com/apache/lucene/pull/13341#issuecomment-2091836416 hm maybe this is not safe? If one creates a doc block filled with empty documents? I'm not sure ...

Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-02 Thread via GitHub
iamsanjay commented on PR #13319: URL: https://github.com/apache/lucene/pull/13319#issuecomment-2092322936 > In general yes, there may be a few exceptions though like `JustCompileSearch` where we want to keep these methods since it's about detecting API changes. Okay in that case, le

Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-02 Thread via GitHub
iamsanjay commented on PR #13319: URL: https://github.com/apache/lucene/pull/13319#issuecomment-2092337036 I also changed scorerSupplier in a few classes to delegate to scorerSupplier instead of scorer.

[PR] Datacube format changes to support materialized views [lucene]

2024-05-02 Thread via GitHub
bharath-techie opened a new pull request, #13342: URL: https://github.com/apache/lucene/pull/13342 ### Description Draft PR for supporting extensions for DataCubes that mainly includes a new format, changes in SegmentInfo and in the IndexingChain flush, and SegmentMerger