[PR] Remove mention of SolrNamedThreadFactory [lucene]

2024-08-26 Thread via GitHub
stefanvodita opened a new pull request, #13690: URL: https://github.com/apache/lucene/pull/13690 Looks like it was forgotten in the instructions for creating executors. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
benwtrent commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731850236 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912BinaryFlatVectorsScorer.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
benwtrent commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731849841 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912BinaryFlatVectorsScorer.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
mayya-sharipova commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731836389 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912BinaryFlatVectorsScorer.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
mayya-sharipova commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731835612 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912BinaryFlatVectorsScorer.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software

Re: [PR] nocommit: demonstrate how a minor change in IndexSearcher can have an inexplicable performance impact [lucene]

2024-08-26 Thread via GitHub
epotyom commented on PR #13657: URL: https://github.com/apache/lucene/pull/13657#issuecomment-2310853821 Added `-Xlog:gc* -Xlog:age*=debug` to test command, looks like GC spends almost equal time in baseline/candidate and is not the root cause here. Baseline: ``` Statistics Ende

Re: [PR] HNSW BP reorder tool [lucene]

2024-08-26 Thread via GitHub
msokolov commented on PR #13683: URL: https://github.com/apache/lucene/pull/13683#issuecomment-2310846008 Updated to fix a few boneheaded mistakes, added support for off-heap sorting of sparse vector values.

Re: [PR] HNSW BP reorder tool [lucene]

2024-08-26 Thread via GitHub
msokolov commented on PR #13683: URL: https://github.com/apache/lucene/pull/13683#issuecomment-2310838038 > I'm not the most familiar one with our vector file formats, but my understanding is that we already maintain a node ID <-> doc ID mapping, e.g. to retain dense node IDs in the case wh

Re: [I] Measure whether graph is strongly connected [lucene]

2024-08-26 Thread via GitHub
msokolov commented on issue #13687: URL: https://github.com/apache/lucene/issues/13687#issuecomment-2310826957 Thanks for opening this! Since writing that comment I implemented a different connection criterion which guarantees that every node is reachable from the entry point node in

Re: [PR] Edit HNSW API docs [lucene]

2024-08-26 Thread via GitHub
msokolov commented on PR #13688: URL: https://github.com/apache/lucene/pull/13688#issuecomment-2310795363 Thank you @pierwill!

Re: [PR] Edit HNSW API docs [lucene]

2024-08-26 Thread via GitHub
msokolov merged PR #13688: URL: https://github.com/apache/lucene/pull/13688

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
rmuir commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731609986 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java: ## @@ -761,4 +763,81 @@ private static int squareDistanceBody128(MemoryS

Re: [PR] Leverage doc value skip lists in DocValuesRewriteMethod if indexed [lucene]

2024-08-26 Thread via GitHub
gsmiller merged PR #13672: URL: https://github.com/apache/lucene/pull/13672

Re: [PR] Speed up prefix sums when decoding doc IDs. [lucene]

2024-08-26 Thread via GitHub
gsmiller commented on PR #13658: URL: https://github.com/apache/lucene/pull/13658#issuecomment-2310639166 @jpountz I'm in favor of keeping the change as well (your reasoning here makes sense to me). I'll see if I can get another data point by running our internal Amazon product search bench

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
ChrisHegarty commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731499127 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java: ## @@ -761,4 +763,81 @@ private static int squareDistanceBody128(

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
ChrisHegarty commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731488489 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java: ## @@ -761,4 +763,81 @@ private static int squareDistanceBody128(

[PR] Edit HNSW API docs [lucene]

2024-08-26 Thread via GitHub
pierwill opened a new pull request, #13688: URL: https://github.com/apache/lucene/pull/13688 Makes some edits for clarity, readability, punctuation, etc.

[PR] Replace Map with IntObjectHashMap for DV producer [lucene]

2024-08-26 Thread via GitHub
bugmakerr opened a new pull request, #13686: URL: https://github.com/apache/lucene/pull/13686 ### Description Today, the map between field name and corresponding meta entry in DocValues Producer is represented by a `HashMap`. To reduce memory usage, we can replace it with an `IntObje

Re: [I] Add higher quantization level for kNN vector search [lucene]

2024-08-26 Thread via GitHub
mikemccand commented on issue #13650: URL: https://github.com/apache/lucene/issues/13650#issuecomment-2310385803 This sounds super promising! There is [some discussion here about PQ and PCA](https://github.com/apache/lucene/issues/13403) as well.

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731284306 ## lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java: ## @@ -293,4 +297,33 @@ public void testNullExecutorNonNullTaskExecutor() { IndexSearcher

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731282471 ## lucene/core/src/test/org/apache/lucene/index/TestForTooMuchCloning.java: ## @@ -80,7 +80,7 @@ public void test() throws Exception { // System.out.println("quer

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731280867 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731275591 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731273966 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -1024,10 +1138,50 @@ public LeafSlice[] get() { leafSlices = O

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731271557 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -890,11 +945,70 @@ public static class LeafSlice { * * @lucene.experimental

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731253793 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -890,11 +945,70 @@ public static class LeafSlice { * * @lucene.experimental

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731251488 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -890,11 +945,70 @@ public static class LeafSlice { * * @lucene.experimental

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731244280 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -890,11 +945,70 @@ public static class LeafSlice { * Review Comment: I mostly had

Re: [PR] Use Max WAND optimizations with ToParentBlockJoinQuery when using ScoreMode.Max [lucene]

2024-08-26 Thread via GitHub
jpountz merged PR #13587: URL: https://github.com/apache/lucene/pull/13587

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
rmuir commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731175737 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java: ## @@ -761,4 +763,81 @@ private static int squareDistanceBody128(MemoryS

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-08-26 Thread via GitHub
rmuir commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1731174206 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java: ## @@ -761,4 +763,81 @@ private static int squareDistanceBody128(MemoryS

Re: [PR] HNSW BP reorder tool [lucene]

2024-08-26 Thread via GitHub
jpountz commented on PR #13683: URL: https://github.com/apache/lucene/pull/13683#issuecomment-2310078679 > because of the way our vector codecs have different codecs for every possible combination of options (Byte vs Float, quantized vs not) this will be a lot of code changes I'm not

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
original-brownbear commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731131293 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -890,11 +945,70 @@ public static class LeafSlice { * * @lucene.experi

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-08-26 Thread via GitHub
original-brownbear commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1731119841 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager

Re: [PR] SOLR-14370: Refactor bin/solr to allow external override of Jetty modules [lucene-solr]

2024-08-26 Thread via GitHub
epugh commented on PR #1385: URL: https://github.com/apache/lucene-solr/pull/1385#issuecomment-2309957221 @athrog is this still something you'd like to get figured out? I'd be happy to work with you to get this chased down...