Re: [PR] Speed up advancing within a sparse block in IndexedDISI. [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on PR #14371: URL: https://github.com/apache/lucene/pull/14371#issuecomment-2739828325 Thanks @vsop-479, have you been able to measure the performance of your patch? I had a similar idea recently. If you look at the newest code in `Lucene101PostingsReader`, you may find w

Re: [I] Handling concurrent search in QueryProfiler [lucene]

2025-03-20 Thread via GitHub
jainankitk commented on issue #14375: URL: https://github.com/apache/lucene/issues/14375#issuecomment-2739471091 > Maybe it tries to do too much by providing min/avg/max aggregates and it should just provide per-slice breakdowns, leaving whether and how to compile aggregates to the applicat

Re: [PR] Speedup merging of HNSW graphs [lucene]

2025-03-20 Thread via GitHub
mayya-sharipova commented on code in PR #14331: URL: https://github.com/apache/lucene/pull/14331#discussion_r2005416489 ## lucene/core/src/java/org/apache/lucene/util/hnsw/ConcurrentHnswMerger.java: ## @@ -51,19 +57,85 @@ protected HnswBuilder createBuilder(KnnVectorValues merg

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005519750 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] Adjust equivalent min similarity HNSW exploration logic [lucene]

2025-03-20 Thread via GitHub
mayya-sharipova commented on code in PR #14366: URL: https://github.com/apache/lucene/pull/14366#discussion_r2005708592 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -266,11 +266,21 @@ void searchLevel( // A bound that holds the minimum s

Re: [PR] Adjust equivalent min similarity HNSW exploration logic [lucene]

2025-03-20 Thread via GitHub
mayya-sharipova commented on code in PR #14366: URL: https://github.com/apache/lucene/pull/14366#discussion_r2005709557 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -266,11 +266,21 @@ void searchLevel( // A bound that holds the minimum s

Re: [PR] Speedup merging of HNSW graphs [lucene]

2025-03-20 Thread via GitHub
mayya-sharipova merged PR #14331: URL: https://github.com/apache/lucene/pull/14331 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lu

Re: [I] Case insensitive regex query with character range [lucene]

2025-03-20 Thread via GitHub
rmuir commented on issue #14378: URL: https://github.com/apache/lucene/issues/14378#issuecomment-2740658343 This isn't a bug, regex parser just does not have this feature. We can add it, but it must be an additional opt-in flag due to performance tradeoffs involved.

[I] build support: java 24 [lucene]

2025-03-20 Thread via GitHub
rmuir opened a new issue, #14379: URL: https://github.com/apache/lucene/issues/14379 ### Description java 23 has disappeared and has been replaced with java 24. the build currently requires 23 exactly, which creates a hurdle for users, since it is difficult to get: does not ex

Re: [I] build support: java 24 [lucene]

2025-03-20 Thread via GitHub
dweiss commented on issue #14379: URL: https://github.com/apache/lucene/issues/14379#issuecomment-2740968727 Really nice indeed! Sadly, I think it'll take just about a million years before it propagates through all the layers until it can hit gradle (but I'd love to be proven wrong). ;)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006940286 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Speedup merging of HNSW graphs (#14331) [lucene]

2025-03-20 Thread via GitHub
mayya-sharipova merged PR #14380: URL: https://github.com/apache/lucene/pull/14380

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006901371 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006904572 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java: ## @@ -89,8 +89,6 @@ final class IntersectTermsEnumFrame { final

Re: [PR] RegExp: add CASE_INSENSITIVE_RANGE support [lucene]

2025-03-20 Thread via GitHub
dweiss commented on code in PR #14381: URL: https://github.com/apache/lucene/pull/14381#discussion_r2006930601 ## lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java: ## @@ -778,6 +786,53 @@ private int[] toCaseInsensitiveChar(int codepoint) { } } + /**

Re: [I] NRT replication should make it possible/easy to use bite-sized commits [lucene]

2025-03-20 Thread via GitHub
vigyasharma commented on issue #14219: URL: https://github.com/apache/lucene/issues/14219#issuecomment-2742476154 > The searchers can then carefully pick and choose which commit points they want to switch to, in a bite-sized / stepping-stone manner The key here is making searchers re

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005462386 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Speedup merging of HNSW graphs [lucene]

2025-03-20 Thread via GitHub
mayya-sharipova commented on code in PR #14331: URL: https://github.com/apache/lucene/pull/14331#discussion_r2005464935 ## lucene/core/src/java/org/apache/lucene/util/hnsw/MergingHnswGraphBuilder.java: ## @@ -0,0 +1,198 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005463552 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005466812 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005473348 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[I] Case insensitive regex query with character range [lucene]

2025-03-20 Thread via GitHub
petrsimon opened a new issue, #14378: URL: https://github.com/apache/lucene/issues/14378 ### Description Hi, I'm implementing regex search in a Java app with Elasticsearch 8.* and I've noticed unexpected behaviour with the `CASE_INSENSITIVE` flag. It seems that Lucene ignores the

Re: [PR] Speed up advancing within a sparse block in IndexedDISI. [lucene]

2025-03-20 Thread via GitHub
vsop-479 commented on PR #14371: URL: https://github.com/apache/lucene/pull/14371#issuecomment-2740387593 Thanks for your feedback @gf2121. This patch is still in progress and has not been measured. > If you look at newest code in Lucene101PostingsReader, you may find we are using

Re: [I] Case insensitive regex query with character range [lucene]

2025-03-20 Thread via GitHub
petrsimon commented on issue #14378: URL: https://github.com/apache/lucene/issues/14378#issuecomment-2740838289 I see, thanks a lot!

Re: [I] Case insensitive regex query with character range [lucene]

2025-03-20 Thread via GitHub
rmuir commented on issue #14378: URL: https://github.com/apache/lucene/issues/14378#issuecomment-2740916833 If the regexp parser doesn't document it has the feature, then it doesn't support it: https://lucene.apache.org/core/10_1_0/core/org/apache/lucene/util/automaton/RegExp.html

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
mikemccand commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005458945 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005498999 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Trie.java: ## @@ -0,0 +1,486 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

Re: [PR] Add a HNSW collector that exits early when nearest neighbor queue saturates [lucene]

2025-03-20 Thread via GitHub
tteofili commented on PR #14094: URL: https://github.com/apache/lucene/pull/14094#issuecomment-2740827581 additional experiments with different quantization levels and filtering: ## No filtering ### Baseline ``` recall latency(ms) nDoc topK fanout maxConn beamWidth

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2006876856 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Optimize ParallelLeafReader to improve term vector fetching efficienc [lucene]

2025-03-20 Thread via GitHub
DivyanshIITB commented on PR #14373: URL: https://github.com/apache/lucene/pull/14373#issuecomment-2741117231 Just a gentle reminder @vigyasharma

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005865922 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] A specialized Trie for Block Tree Index [lucene]

2025-03-20 Thread via GitHub
gf2121 commented on code in PR #14333: URL: https://github.com/apache/lucene/pull/14333#discussion_r2005852050 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java: ## @@ -0,0 +1,552 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [I] build support: java 24 [lucene]

2025-03-20 Thread via GitHub
rmuir commented on issue #14379: URL: https://github.com/apache/lucene/issues/14379#issuecomment-2740938412 I tried allowing 24 and gradle only failed in the usual way (incompatible classfile): we have to wait for them to issue a gradle release that "supports 24" so they can parse the class

Re: [I] build support: java 24 [lucene]

2025-03-20 Thread via GitHub
dweiss commented on issue #14379: URL: https://github.com/apache/lucene/issues/14379#issuecomment-2740939418 https://github.com/gradle/gradle/issues/32290

Re: [I] build support: java 24 [lucene]

2025-03-20 Thread via GitHub
dweiss commented on issue #14379: URL: https://github.com/apache/lucene/issues/14379#issuecomment-2740930456 > build currently requires 23 exactly This is terrible. I'll take a look.

Re: [I] build support: java 24 [lucene]

2025-03-20 Thread via GitHub
dweiss commented on issue #14379: URL: https://github.com/apache/lucene/issues/14379#issuecomment-2740943620 They do tons of weird stuff these days that requires bytecode manipulation and touching everything upon loading. I don't think there is a way around other than wait for that issue to