Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-09-27 Thread via GitHub
rmuir commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2379139038 I don't think we should add methods to DataInput when we are only "trying to use elsewhere". The method is only used from one place: PostingsUtil, let's move it out

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778438135 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,55 @@ private TopDocs getLeafResults( } } + private DocId

[I] tests fail due to large amount of output with highish tests.iters [lucene]

2024-09-27 Thread via GitHub
msokolov opened a new issue, #13829: URL: https://github.com/apache/lucene/issues/13829 ### Description I've noticed that some tests fail when `tests.iters` is set to a highish value (like 20) because stdout grows above the allowed threshold. I assume this is something we'd want to f

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-09-27 Thread via GitHub
dweiss commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2379761858 Folks, if you'd like to do anything larger to this, please go ahead. I have very limited access to the internet this week and I won't be able to track your comments closely or make any

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778462654 ## lucene/core/src/test/org/apache/lucene/document/TestManyKnnDocs.java: ## @@ -46,27 +54,139 @@ public void testLargeSegment() throws Exception { mp.setMaxMer

Re: [PR] Compute multiple float aggregations in one go [lucene]

2024-09-27 Thread via GitHub
stefanvodita commented on PR #12547: URL: https://github.com/apache/lucene/pull/12547#issuecomment-237933 I'm not sure either. Since the new aggregation engine is in sandbox, it makes sense to keep developing the old aggregation engine. On the other hand, that's not very productive if w

Re: [PR] Compute multiple float aggregations in one go [lucene]

2024-09-27 Thread via GitHub
gsmiller commented on PR #12547: URL: https://github.com/apache/lucene/pull/12547#issuecomment-2379648625 @stefanvodita do you think this change is still worth moving forward after the new sandbox faceting implementation was added? Being able to compute an arbitrary number of aggregations i

Re: [PR] Fixed bit set vector [lucene]

2024-09-27 Thread via GitHub
benwtrent commented on PR #13827: URL: https://github.com/apache/lucene/pull/13827#issuecomment-2379478454 I wonder if this is much faster than auto-vectorization provided by the JVM on AVX256 & 512. ARM does have an issue with vectorizing long values.

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
benwtrent commented on PR #13635: URL: https://github.com/apache/lucene/pull/13635#issuecomment-2379026129 Thinking more and more, I do not like the idea of adding to the leaf function definition. But, this does seem useful. I think we can attach it to kNN collectors. I don't

Re: [I] tests fail due to large amount of output with highish tests.iters [lucene]

2024-09-27 Thread via GitHub
msokolov commented on issue #13829: URL: https://github.com/apache/lucene/issues/13829#issuecomment-2379179445 Example failing test output: ``` org.apache.lucene.demo.TestDemo.classMethod (:lucene:demo)

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778417770 ## lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java: ## @@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778399477 ## lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java: ## @@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {

Re: [I] tests fail due to large amount of output with highish tests.iters [lucene]

2024-09-27 Thread via GitHub
msokolov commented on issue #13829: URL: https://github.com/apache/lucene/issues/13829#issuecomment-2379202588 There is something about the way `TestDocInverterPerFieldErrorInfo.testNoExtraNoise` captures the info-stream that causes it to always fail when `tests.iters` > 1. This whole situa

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-09-27 Thread via GitHub
easyice commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2379114801 > the current readGroupVInts is also pretty specific to what the postings are doing. And it just calls readVint() and readGroupVint() behind the scenes. Seems like a good candidate to mo

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778468665 ## lucene/core/src/test/org/apache/lucene/document/TestManyKnnDocs.java: ## @@ -46,27 +54,139 @@ public void testLargeSegment() throws Exception { mp.setMaxMer

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778463376 ## lucene/core/src/test/org/apache/lucene/document/TestManyKnnDocs.java: ## @@ -46,27 +54,139 @@ public void testLargeSegment() throws Exception { mp.setMaxMer

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778453629 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,49 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Reduce long[] array allocation for bitset in readBitSetIterator [lucene]

2024-09-27 Thread via GitHub
easyice commented on code in PR #13828: URL: https://github.com/apache/lucene/pull/13828#discussion_r1778298810 ## lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java: ## @@ -205,12 +208,16 @@ void readInts(IndexInput in, int count, int[] docIDs) throws IOExceptio

Re: [PR] Drop final modifier on the public DataInput.readGroupVInts method [lucene]

2024-09-27 Thread via GitHub
rmuir commented on PR #13825: URL: https://github.com/apache/lucene/pull/13825#issuecomment-2379010887 the current readGroupVInts is also pretty specific to what the postings are doing. And it just calls readVint() and readGroupVint() behind the scenes. Seems like a good candidate to move t

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778436381 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,55 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778422918 ## lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java: ## @@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778419593 ## lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java: ## @@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778412763 ## lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java: ## @@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
seanmacavaney commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778379250 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -283,11 +289,17 @@ public void search(String

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778218742 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,55 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Reduce long[] array allocation for bitset in readBitSetIterator [lucene]

2024-09-27 Thread via GitHub
easyice commented on code in PR #13828: URL: https://github.com/apache/lucene/pull/13828#discussion_r1778291944 ## lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java: ## @@ -205,12 +208,16 @@ void readInts(IndexInput in, int count, int[] docIDs) throws IOExceptio

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778283453 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/MismatchedLeafReader.java: ## @@ -68,6 +71,28 @@ public CacheHelper getReaderCacheHelper() { re

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778255372 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java: ## @@ -247,7 +248,12 @@ public ByteVectorValues getByteVectorValues(String

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778254222 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -283,11 +289,17 @@ public void search(String fie

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778226954 ## lucene/core/src/java/org/apache/lucene/search/KnnByteVectorQuery.java: ## @@ -133,4 +149,18 @@ public int hashCode() { public byte[] getTargetCopy() { re

Re: [PR] Reduce long[] array allocation for bitset in readBitSetIterator [lucene]

2024-09-27 Thread via GitHub
gf2121 commented on code in PR #13828: URL: https://github.com/apache/lucene/pull/13828#discussion_r1778204945 ## lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java: ## @@ -36,6 +38,7 @@ final class DocIdsWriter { private static final byte LEGACY_DELTA_VINT = (

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778221739 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,55 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778215992 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,55 @@ private TopDocs getLeafResults( } } + private DocId

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-09-27 Thread via GitHub
cpoerschke commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1778215267 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -156,6 +196,55 @@ private TopDocs getLeafResults( } } + private DocId