[GitHub] [lucene] zhaih commented on issue #12358: Optimize `count()` for BooleanQuery disjunction

2023-06-16 Thread via GitHub
zhaih commented on issue #12358: URL: https://github.com/apache/lucene/issues/12358#issuecomment-1595284895 After roughly catching up with the threads, I would like to go back to > +1 to update the Weight#count API contract or introduce a new API to better optimize counting. So we

[GitHub] [lucene] almogtavor commented on issue #12318: Async Usage of Lucene Monitor through a Reactive Programming based application

2023-06-16 Thread via GitHub
almogtavor commented on issue #12318: URL: https://github.com/apache/lucene/issues/12318#issuecomment-1595104039 @romseygeek Oh so that sounds even better than what I thought. In the case of `ByteBuffersDirectory` & MMap, I can treat the match operation like a total sync operation and use i

[GitHub] [lucene] jainankitk commented on issue #12297: Unnecessary float[](BM25Scorer) allocations for non-scoring queries

2023-06-16 Thread via GitHub
jainankitk commented on issue #12297: URL: https://github.com/apache/lucene/issues/12297#issuecomment-1595083334 > I agree with Robert that 1kB per segment doesn't sound like a crazy amount of allocations, which suggests that you are searching many segments. Does the memory allocation profi

[GitHub] [lucene] zhaih commented on pull request #12371: [Draft] #12236 Lazily compute similarity score

2023-06-16 Thread via GitHub
zhaih commented on PR #12371: URL: https://github.com/apache/lucene/pull/12371#issuecomment-1595003976 Thank you @Jackyrie2 for working on it, I think @benwtrent's concern about memory makes sense but it seems to me we should be able to reduce the memory usage later on (as this is still a dra

[GitHub] [lucene] Jackyrie2 commented on pull request #12371: [Draft] #12236 Lazily compute similarity score

2023-06-16 Thread via GitHub
Jackyrie2 commented on PR #12371: URL: https://github.com/apache/lucene/pull/12371#issuecomment-1594985039 @benwtrent thanks for your suggestion, I will take some time this weekend to try some benchmarking. I have some benchmark ideas in mind that might show some improvements. -- This is

[GitHub] [lucene] zhaih commented on pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
zhaih commented on PR #12372: URL: https://github.com/apache/lucene/pull/12372#issuecomment-1594967308 > Not sure how that's related to this code or how to fix it I think this is due to the newly cut 9.7 branch, so we probably just need to wait a bit more. I see all the auto testing is com

[GitHub] [lucene] jbellis commented on pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
jbellis commented on PR #12372: URL: https://github.com/apache/lucene/pull/12372#issuecomment-1594923701 CI says this test is failing ```org.apache.lucene.backward_index.TestBackwardsCompatibility > test suite's output saved to /home/runner/work/lucene/lucene/lucene/backward-codecs/b

[GitHub] [lucene] jbellis commented on a diff in pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
jbellis commented on code in PR #12372: URL: https://github.com/apache/lucene/pull/12372#discussion_r1232415072 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -262,7 +274,6 @@ private NeighborQueue searchLevel( int visitedLimit) t

[GitHub] [lucene] sohami commented on pull request #12374: Provide constructor to accept the LeafSlice computed by extensions

2023-06-16 Thread via GitHub
sohami commented on PR #12374: URL: https://github.com/apache/lucene/pull/12374#issuecomment-1594871500 @javanna Regarding the concern about too many constructors, I am also wondering whether we can deprecate the ones that take IndexReader as input and keep the ones that take IndexReaderContext over the time w

[GitHub] [lucene] benwtrent commented on a diff in pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
benwtrent commented on code in PR #12372: URL: https://github.com/apache/lucene/pull/12372#discussion_r1232396894 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -262,7 +274,6 @@ private NeighborQueue searchLevel( int visitedLimit)

[GitHub] [lucene] jbellis commented on pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
jbellis commented on PR #12372: URL: https://github.com/apache/lucene/pull/12372#issuecomment-1594763703 I'm using the million-row sift dataset via this harness https://github.com/jbellis/hnswdemo/tree/benchmarking I believe what is happening is that allocation is basically free and t

[GitHub] [lucene] benwtrent commented on pull request #12371: [Draft] #12236 Lazily compute similarity score

2023-06-16 Thread via GitHub
benwtrent commented on PR #12371: URL: https://github.com/apache/lucene/pull/12371#issuecomment-1594666777 I ran this change with https://github.com/mikemccand/luceneutil with knnPerfTest with over 1 vectors from `enwiki-20120502-lines-1k-300d.vec` I tried maxConn 32 & 96 and bea

[GitHub] [lucene] benwtrent commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores

2023-06-16 Thread via GitHub
benwtrent commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1594648943 > And then switch the score to scale in 10 as a breaking change. Or is condoning negative scores under any circumstances a non-starter? If you are utilizing hybrid search, n

[GitHub] [lucene] benwtrent commented on pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
benwtrent commented on PR #12372: URL: https://github.com/apache/lucene/pull/12372#issuecomment-1594642822 Hey @jbellis the change looks nice to me. But, I ran https://github.com/mikemccand/luceneutil `knnPerfTest` and saw no change at all in indexing time. Am I missing something? Cou

[GitHub] [lucene] jbellis commented on pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)

2023-06-16 Thread via GitHub
jbellis commented on PR #12372: URL: https://github.com/apache/lucene/pull/12372#issuecomment-1594627724 cc @msokolov @benwtrent @zhaih -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [lucene] uschindler merged pull request #12376: Allow VectorUtilProvider tests to be executed although hardware may not fully support vectorization or if C2 is not enabled

2023-06-16 Thread via GitHub
uschindler merged PR #12376: URL: https://github.com/apache/lucene/pull/12376