Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-04-05 Thread via GitHub
jainankitk commented on code in PR #14267: URL: https://github.com/apache/lucene/pull/14267#discussion_r2026298155 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -517,6 +623,11 @@ public byte[] getUpperPoint() { return upperPoint.clone(); }

Re: [PR] Add support for determining off-heap memory requirements for KnnVectorsReader [lucene]

2025-04-05 Thread via GitHub
mayya-sharipova commented on code in PR #14426: URL: https://github.com/apache/lucene/pull/14426#discussion_r2027061812 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java: ## @@ -130,4 +134,56 @@ public KnnVectorsReader getMergeInstance() { * The default

Re: [PR] PointInSetQuery clips segments by lower and upper [lucene]

2025-04-05 Thread via GitHub
hanbj commented on code in PR #14268: URL: https://github.com/apache/lucene/pull/14268#discussion_r2020444502 ## lucene/core/src/java/org/apache/lucene/search/PointInSetQuery.java: ## @@ -122,6 +126,11 @@ protected PointInSetQuery(String field, int numDims, int bytesPerDim, Str

Re: [PR] Speed up histogram collection in a similar way as disjunction counts. [lucene]

2025-04-05 Thread via GitHub
jpountz merged PR #14273: URL: https://github.com/apache/lucene/pull/14273 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Adds github action to verify changelog entry and set milestone to PRs [lucene]

2025-04-05 Thread via GitHub
javanna commented on PR #14279: URL: https://github.com/apache/lucene/pull/14279#issuecomment-2737697207 Hey @stefanvodita the changelog entry for this was filed under 10.2, but I don't believe the change itself was backported. Can you double check and either backport or move the changelog

Re: [PR] Add support for determining off-heap memory requirements for KnnVectorsReader [lucene]

2025-04-05 Thread via GitHub
ChrisHegarty commented on code in PR #14426: URL: https://github.com/apache/lucene/pull/14426#discussion_r202743 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java: ## @@ -130,4 +134,56 @@ public KnnVectorsReader getMergeInstance() { * The default imp

[PR] fix TestIndexWriterWithThreads#testIOExceptionDuringAbortWithThreadsOnlyOnce [lucene]

2025-04-05 Thread via GitHub
guojialiang92 opened a new pull request, #14424: URL: https://github.com/apache/lucene/pull/14424 ### Description This PR aims to address issue [14423](https://github.com/apache/lucene/issues/14423). ### Tests 1. In order to stabilize the reproduce problem, I added a

Re: [PR] MultiRange query for SortedNumeric DocValues [lucene]

2025-04-05 Thread via GitHub
mkhludnev commented on code in PR #14404: URL: https://github.com/apache/lucene/pull/14404#discussion_r2013967382 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/search/SortedNumericDocValuesMultiRangeQuery.java: ## @@ -0,0 +1,249 @@ +/* + * Licensed to the Apache Software

Re: [PR] quick exit on filter query matching no docs when rewriting knn query [lucene]

2025-04-05 Thread via GitHub
jpountz commented on PR #14418: URL: https://github.com/apache/lucene/pull/14418#issuecomment-2762525239 Can you help me understand what work this change helps save?

Re: [PR] Allow skip cache factor to be updated dynamically [lucene]

2025-04-05 Thread via GitHub
sgup432 commented on PR #14412: URL: https://github.com/apache/lucene/pull/14412#issuecomment-2763186278 @jpountz Added a CHANGES entry.

Re: [PR] Pack file pointers when merging BKD trees [lucene]

2025-04-05 Thread via GitHub
benwtrent commented on code in PR #14393: URL: https://github.com/apache/lucene/pull/14393#discussion_r2010085479 ## lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java: ## @@ -1961,7 +1989,7 @@ private void build( int leafCardinality = heapSource.computeCardin

Re: [I] Examine the effects of MADV_RANDOM when MGLRU is enabled in Linux kernel [lucene]

2025-04-05 Thread via GitHub
jimczi commented on issue #14408: URL: https://github.com/apache/lucene/issues/14408#issuecomment-2755375551 I believe the question is whether we need to reconsider our assumptions when defaulting to random read advice in the current code. With the linked change, using `MADV_RANDOM` will ex

Re: [PR] PointInSetQuery early exit on non-matching segments [lucene]

2025-04-05 Thread via GitHub
hanbj commented on code in PR #14268: URL: https://github.com/apache/lucene/pull/14268#discussion_r2022086841 ## lucene/core/src/java/org/apache/lucene/search/PointInSetQuery.java: ## @@ -248,6 +255,33 @@ public long cost() { } } + private boolean checkVal

Re: [PR] KeywordField.newSetQuery() to use prefixed terms for IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
jainankitk commented on code in PR #14435: URL: https://github.com/apache/lucene/pull/14435#discussion_r2027694440 ## lucene/core/src/java/org/apache/lucene/document/KeywordField.java: ## @@ -175,9 +174,8 @@ public static Query newExactQuery(String field, String value) { pub

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006876856 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[PR] New IndexReaderFunctions.positionLength from the norm [lucene]

2025-04-05 Thread via GitHub
dsmiley opened a new pull request, #14433: URL: https://github.com/apache/lucene/pull/14433 ### Description Introduces `org.apache.lucene.queries.function.IndexReaderFunctions#positionLength` Javadocs: > Creates a value source that returns the position length (number of term

Re: [PR] Speedup merging of HNSW graphs [lucene]

2025-04-05 Thread via GitHub
mayya-sharipova commented on code in PR #14331: URL: https://github.com/apache/lucene/pull/14331#discussion_r2005462586 ## lucene/core/src/java/org/apache/lucene/util/hnsw/ConcurrentHnswMerger.java: ## @@ -51,19 +57,85 @@ protected HnswBuilder createBuilder(KnnVectorValues merg

Re: [I] Address gradle temp file pollution insanity [lucene]

2025-04-05 Thread via GitHub
dweiss commented on issue #14385: URL: https://github.com/apache/lucene/issues/14385#issuecomment-2743732858 I think the hack we had in https://github.com/apache/lucene-solr/pull/1767/files used to work but gradle must have relocated those temp files... The fix is simple but I'd lik

Re: [PR] RegExp: add CASE_INSENSITIVE_RANGE support [lucene]

2025-04-05 Thread via GitHub
rmuir commented on PR #14381: URL: https://github.com/apache/lucene/pull/14381#issuecomment-2743822277 @dweiss thanks for the suggestion there, gazillions of array creations avoided. so now this thing will only spike cpu during parsing at worst. I honestly forget you can pass functions to f

Re: [PR] upgrade icu dependency from 74.2 -> 77.1 [lucene]

2025-04-05 Thread via GitHub
rmuir merged PR #14386: URL: https://github.com/apache/lucene/pull/14386

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006940286 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] RegExp: add CASE_INSENSITIVE_RANGE support [lucene]

2025-04-05 Thread via GitHub
rmuir commented on code in PR #14381: URL: https://github.com/apache/lucene/pull/14381#discussion_r2007499003 ## lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java: ## @@ -778,6 +786,53 @@ private int[] toCaseInsensitiveChar(int codepoint) { } } + /** +

Re: [PR] Completion FSTs to be loaded off-heap by default [lucene]

2025-04-05 Thread via GitHub
javanna commented on code in PR #14364: URL: https://github.com/apache/lucene/pull/14364#discussion_r2000872434 ## lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestSuggestField.java: ## @@ -951,7 +951,16 @@ static IndexWriterConfig iwcWithSuggestField(Analyz

Re: [I] TestIndexSortBackwardsCompatibility.testSortedIndexAddDocBlocks fails reproducibly [lucene]

2025-04-05 Thread via GitHub
dweiss closed issue #14344: TestIndexSortBackwardsCompatibility.testSortedIndexAddDocBlocks fails reproducibly URL: https://github.com/apache/lucene/issues/14344

Re: [PR] Preparing existing profiler for adding concurrent profiling [lucene]

2025-04-05 Thread via GitHub
jainankitk commented on PR #14413: URL: https://github.com/apache/lucene/pull/14413#issuecomment-2762048902 > You just need to replace ctx with _. Ah, my bad! I tried `.`, but we can't use that as part of variable name. Thanks for the suggestion @jpountz. At a high level, I hav

Re: [I] ParallelLeafReader.getTermVectors can indirectly load TVs multiple times [LUCENE-6868] [lucene]

2025-04-05 Thread via GitHub
vigyasharma closed issue #7926: ParallelLeafReader.getTermVectors can indirectly load TVs multiple times [LUCENE-6868] URL: https://github.com/apache/lucene/issues/7926

Re: [PR] Add support for determining off-heap memory requirements for KnnVectorsReader [lucene]

2025-04-05 Thread via GitHub
jimczi commented on code in PR #14426: URL: https://github.com/apache/lucene/pull/14426#discussion_r2027392059 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java: ## @@ -130,4 +134,56 @@ public KnnVectorsReader getMergeInstance() { * The default implement

Re: [PR] New IndexReaderFunctions.positionLength from the norm [lucene]

2025-04-05 Thread via GitHub
bruno-roustant commented on PR #14433: URL: https://github.com/apache/lucene/pull/14433#issuecomment-2777888670 Why not numTerms() instead of positionLength()? Inside Similarity.computeNorm(), the value is named numTerms.

[PR] Add CaseFolding.fold(), inverse of expand(), move to UnicodeUtil, add filter [lucene]

2025-04-05 Thread via GitHub
rmuir opened a new pull request, #14389: URL: https://github.com/apache/lucene/pull/14389 Regexp has the ability to erase case differences at query time (the slow way), but there's no corresponding ability to do it the fast-way: at index time. There's LowerCaseFilter, but LowerCaseFil

Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-04-05 Thread via GitHub
jainankitk commented on PR #14267: URL: https://github.com/apache/lucene/pull/14267#issuecomment-2773131906 @hanbj - Thanks for patiently addressing the review comments. While I don't see any performance regression risk myself, I am wondering if we can do one quick performance benchmark run

Re: [PR] Support modifying segmentInfos.counter in IndexWriter [lucene]

2025-04-05 Thread via GitHub
guojialiang92 commented on PR #14417: URL: https://github.com/apache/lucene/pull/14417#issuecomment-2766116736 Thanks, @vigyasharma I also looked at Lucene's native segment replication, just sharing my personal opinion. > Also, IIUC `IndexWriter#advanceSegmentInfosVersion()` was a

Re: [I] Use @snippet javadoc tag for snippets [lucene]

2025-04-05 Thread via GitHub
dweiss commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2755082414 I've toyed with it a bit but I don't see a way for it to not break those /// comments. An alternative is to fork it, fix what we need and then use the forked version from spotless. T

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2771814782 I roughly implemented the idea. This is my first time forking a new codec, hopefully have not made too many mistakes :) A few thoughts during my refactoring: * I thought i on

Re: [I] Incorrect use of fsync [lucene]

2025-04-05 Thread via GitHub
rmuir commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2772221194 Nobody needs to fsync any temporary files, ever. They are temporary: we don't need them durable. Look at how lucene uses temporary files to understand this. We don't need suc

Re: [PR] bump antlr 4.11.1 -> 4.13.2 [lucene]

2025-04-05 Thread via GitHub
rmuir commented on code in PR #14388: URL: https://github.com/apache/lucene/pull/14388#discussion_r2008139072 ## lucene/expressions/src/generated/checksums/generateAntlr.json: ## @@ -1,7 +1,8 @@ { "lucene/expressions/src/java/org/apache/lucene/expressions/js/Javascript.g4

Re: [PR] New IndexReaderFunctions.positionLength from the norm [lucene]

2025-04-05 Thread via GitHub
dsmiley commented on PR #14433: URL: https://github.com/apache/lucene/pull/14433#issuecomment-2780732429 `fieldLength` works for me. I'd like `fieldPositionLength` more as it characterizes the basis of the length (it's not characters). BTW some other methods on this class don't have "fiel

Re: [PR] KeywordField.newSetQuery() to reuse prefixed terms in IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
mkhludnev commented on code in PR #14435: URL: https://github.com/apache/lucene/pull/14435#discussion_r2029801915 ## lucene/core/src/java/org/apache/lucene/document/KeywordField.java: ## @@ -175,9 +174,8 @@ public static Query newExactQuery(String field, String value) { publ

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-05 Thread via GitHub
jpountz commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2780641860 > and some deletes being addressed is better than none. This part of your message suggests that deletes get reclaimed progressively over time, which is often not true. So wait

Re: [PR] New IndexReaderFunctions.positionLength from the norm [lucene]

2025-04-05 Thread via GitHub
jpountz commented on PR #14433: URL: https://github.com/apache/lucene/pull/14433#issuecomment-2780644329 What about calling it just "field length", since this is the length as computed for the purpose of length normalization?

Re: [PR] Allow skip cache factor to be updated dynamically [lucene]

2025-04-05 Thread via GitHub
sgup432 commented on code in PR #14412: URL: https://github.com/apache/lucene/pull/14412#discussion_r2019109527 ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -122,12 +123,30 @@ public LRUQueryCache( long maxRamBytesUsed, Predicate leave

Re: [PR] Let Decompressor implement the Closeable interface. [lucene]

2025-04-05 Thread via GitHub
jpountz commented on PR #14438: URL: https://github.com/apache/lucene/pull/14438#issuecomment-2778028781 Unfortunately, you can't easily use close() to release resources from a Decompressor, because `StoredFieldsReader` is cloneable, and close() is never called on the clones. The only worka
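The clone-versus-close concern raised in this comment can be illustrated with a minimal, self-contained sketch. It does not use Lucene's classes: `HypotheticalReader` and `HypotheticalDecompressor` are invented stand-ins, and the resource counter exists only to make the effect visible. The assumption being modelled is the one stated in the comment, namely that clones of the reader are never closed.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: these classes are hypothetical and are NOT Lucene's API.
class CloneCloseSketch {

  // Counts resources that have been acquired but not yet released.
  static final AtomicInteger openResources = new AtomicInteger();

  // Stand-in for a decompressor that holds a resource released in close().
  static class HypotheticalDecompressor implements Closeable {
    HypotheticalDecompressor() {
      openResources.incrementAndGet();
    }

    // Each copy acquires its own resource.
    HypotheticalDecompressor copy() {
      return new HypotheticalDecompressor();
    }

    @Override
    public void close() {
      openResources.decrementAndGet();
    }
  }

  // Stand-in for a cloneable reader that owns one decompressor.
  static class HypotheticalReader implements Closeable, Cloneable {
    private final HypotheticalDecompressor decompressor;

    HypotheticalReader(HypotheticalDecompressor decompressor) {
      this.decompressor = decompressor;
    }

    // A clone gets its own decompressor copy, hence its own resource.
    @Override
    public HypotheticalReader clone() {
      return new HypotheticalReader(decompressor.copy());
    }

    @Override
    public void close() {
      decompressor.close();
    }
  }

  // Creates a reader, clones it `clones` times without ever closing the clones,
  // closes only the original, and reports how many resources remain open.
  static int leakedAfterClosingOriginalOnly(int clones) {
    openResources.set(0);
    HypotheticalReader original = new HypotheticalReader(new HypotheticalDecompressor());
    for (int i = 0; i < clones; i++) {
      original.clone(); // used and dropped; close() is never called on it
    }
    original.close();
    return openResources.get();
  }

  public static void main(String[] args) {
    System.out.println(leakedAfterClosingOriginalOnly(3));
  }
}
```

In this sketch every unclosed clone leaves one resource open, which is why relying on `close()` alone would not release what the clones hold.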

[PR] KeywordField.newSetQuery() to use prefixed terms for IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
mkhludnev opened a new pull request, #14435: URL: https://github.com/apache/lucene/pull/14435 fix #14425 ### Description

[I] QueryParser parsing a phrase with a wildcard [lucene]

2025-04-05 Thread via GitHub
viliam-durina opened a new issue, #14440: URL: https://github.com/apache/lucene/issues/14440 ### Description Hi all, I have tried to parse this query using the classic QueryParser: String sQuery = "\"foo bar*\""; The query was parsed into a PhraseQuery with two t

[PR] Use FixedLengthBytesRefArray in OneDimensionBKDWriter to hold split values [lucene]

2025-04-05 Thread via GitHub
iverase opened a new pull request, #14383: URL: https://github.com/apache/lucene/pull/14383 We are currently using a list which feels wasteful. For example looking into the heap dump on an IP field, we were using almost double of the heap necessary to hold the split values: https://g

Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]

2025-04-05 Thread via GitHub
benwtrent commented on PR #14173: URL: https://github.com/apache/lucene/pull/14173#issuecomment-2744242199 > do you confirm that, according to your knowledge, any relevant and active work toward multi-valued vectors in Lucene is effectively aggregated here? @alessandrobenedetti I thin

[PR] Speedup merging of HNSW graphs (#14331) [lucene]

2025-04-05 Thread via GitHub
mayya-sharipova opened a new pull request, #14380: URL: https://github.com/apache/lucene/pull/14380 Backport for #14331 Currently when doing merging of HNSW graphs incrementally, we first initialize a graph from the biggest segment, and for other segments, we rebuild the graphs compl

Re: [PR] Handle NaN results in TestVectorUtilSupport.testBinaryVectors [lucene]

2025-04-05 Thread via GitHub
benwtrent commented on code in PR #14419: URL: https://github.com/apache/lucene/pull/14419#discussion_r2018509188 ## lucene/core/src/test/org/apache/lucene/internal/vectorization/TestVectorUtilSupport.java: ## @@ -210,9 +210,13 @@ public void testMinMaxScalarQuantize() { }

Re: [PR] KeywordField.newSetQuery() to reuse prefixed terms in IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
jainankitk commented on code in PR #14435: URL: https://github.com/apache/lucene/pull/14435#discussion_r2029926829 ## lucene/core/src/java/org/apache/lucene/document/KeywordField.java: ## @@ -175,9 +174,8 @@ public static Query newExactQuery(String field, String value) { pub

Re: [PR] build: generate CaseFolding.java from "gradle regenerate" [lucene]

2025-04-05 Thread via GitHub
uschindler commented on code in PR #14384: URL: https://github.com/apache/lucene/pull/14384#discussion_r2008497905 ## lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java: ## @@ -759,23 +759,14 @@ private Automaton toAutomaton( * @return the original codepoint a

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005470873 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [I] IndexReader#leaves method is slightly confusing [lucene]

2025-04-05 Thread via GitHub
jpountz commented on issue #14367: URL: https://github.com/apache/lucene/issues/14367#issuecomment-2748919960 Hmm, maybe I closed a bit too quickly as this issue only pointed out confusion with `IndexReader#leaves`, it did not suggest a particular approach. That said, I'm aligned with

[PR] Adding TestSpanWithinQuery with basic test cases for SpanWithinQuery [lucene]

2025-04-05 Thread via GitHub
slow-J opened a new pull request, #14405: URL: https://github.com/apache/lucene/pull/14405 TEST: ./gradlew check ### Description I was looking at an old issue https://github.com/apache/lucene/issues/7145 which talks about unit tests for SpanWithinQuery. I noticed that there was

Re: [I] Use @snippet javadoc tag for snippets [lucene]

2025-04-05 Thread via GitHub
rmuir commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2754255056 @dweiss I also wonder, with an "autoformat" workflow, if we even care so much. I don't understand what is so sacrosanct about google's format: to me it is ugly. Snippet tag is

Re: [PR] Support modifying segmentInfos.counter in IndexWriter [lucene]

2025-04-05 Thread via GitHub
vigyasharma commented on PR #14417: URL: https://github.com/apache/lucene/pull/14417#issuecomment-2764418906 I think we can add a couple more tests to make it robust. 1. Some tests around concurrency – index with multiple threads, then advance the counter in one of the threads, and valid

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14401: URL: https://github.com/apache/lucene/pull/14401#discussion_r2019735302 ## lucene/test-framework/src/java/org/apache/lucene/tests/search/AssertingLeafCollector.java: ## @@ -50,6 +50,14 @@ public void collect(DocIdStream stream) throws IOExc

Re: [PR] Adds github action to verify changelog entry and set milestone to PRs [lucene]

2025-04-05 Thread via GitHub
stefanvodita commented on PR #14279: URL: https://github.com/apache/lucene/pull/14279#issuecomment-2743574250 Thanks for pointing that out @javanna! Funny how that happened on a PR that's specifically about the changelog. We should only push this to main. I'll actually delete the entry for

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006885578 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieReader.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[PR] KeywordField.newSetQuery() to reuse prefixed terms in IndexOrDocValue… [lucene]

2025-04-05 Thread via GitHub
mkhludnev opened a new pull request, #14442: URL: https://github.com/apache/lucene/pull/14442 …sQuery (#14435) * KeywordField.newSetQuery() reuses prefixed terms. fix #14425 ### Description

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2022727361 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [I] Reuse packedTerms between two TermInSetQuery which are combined by IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
mkhludnev closed issue #14425: Reuse packedTerms between two TermInSetQuery which are combined by IndexOrDocValuesQuery URL: https://github.com/apache/lucene/issues/14425

Re: [PR] KeywordField.newSetQuery() to reuse prefixed terms in IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
mkhludnev merged PR #14435: URL: https://github.com/apache/lucene/pull/14435

Re: [PR] Optimize commit retention policy to maintain only the last 5 commits [lucene]

2025-04-05 Thread via GitHub
vigyasharma commented on PR #14325: URL: https://github.com/apache/lucene/pull/14325#issuecomment-2781125749 This PR changes the existing `KeepLastCommitDeletionPolicy` which is not what we want. I've created a new, beginner issue, #1 that specifies the requirements from this task.

Re: [PR] Optimize commit retention policy to maintain only the last 5 commits [lucene]

2025-04-05 Thread via GitHub
vigyasharma closed pull request #14325: Optimize commit retention policy to maintain only the last 5 commits URL: https://github.com/apache/lucene/pull/14325

Re: [PR] Revert "Add UnwrappingReuseStrategy for AnalyzerWrapper (#14154)" [lucene]

2025-04-05 Thread via GitHub
mayya-sharipova merged PR #14437: URL: https://github.com/apache/lucene/pull/14437

Re: [PR] [DRAFT] Case-insensitive matching over union of strings [lucene]

2025-04-05 Thread via GitHub
rmuir commented on code in PR #14350: URL: https://github.com/apache/lucene/pull/14350#discussion_r2000536590 ## lucene/core/src/java/org/apache/lucene/util/automaton/CaseFolding.java: ## @@ -743,4 +743,42 @@ static int[] lookupAlternates(int codepoint) { return alts;

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-04-05 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2022767256 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,632 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [I] Reuse packedTerms between two TermInSetQuery which are combined by IndexOrDocValuesQuery [lucene]

2025-04-05 Thread via GitHub
mkhludnev commented on issue #14425: URL: https://github.com/apache/lucene/issues/14425#issuecomment-2781083660 To be released in 10.3

Re: [PR] KeywordField.newSetQuery() to reuse prefixed terms in IndexOrDocValue… [lucene]

2025-04-05 Thread via GitHub
mkhludnev merged PR #14442: URL: https://github.com/apache/lucene/pull/14442

[PR] Support incremental refresh in Searcher Managers. [lucene]

2025-04-05 Thread via GitHub
vigyasharma opened a new pull request, #14443: URL: https://github.com/apache/lucene/pull/14443 In segment based replication systems, a large replication payload (checkpoint) can induce heavy page faults, cause thrashing for in-flight search requests, and affect overall search performance.