[GitHub] [lucene] jpountz commented on a diff in pull request #12383: Assign a dummy simScorer in TermsWeight if score is not needed

2023-06-25 Thread via GitHub
jpountz commented on code in PR #12383: URL: https://github.com/apache/lucene/pull/12383#discussion_r1241088499 ## lucene/core/src/java/org/apache/lucene/search/TermQuery.java: ## @@ -72,7 +72,16 @@ public TermWeight( if (termStats == null) { this.simScorer = nul

[GitHub] [lucene] jpountz commented on issue #12297: Unnecessary float[](BM25Scorer) allocations for non-scoring queries

2023-06-25 Thread via GitHub
jpountz commented on issue #12297: URL: https://github.com/apache/lucene/issues/12297#issuecomment-1605936169 +1 to not change the signature of `getSimilarity`.

[GitHub] [lucene] rmuir commented on issue #12393: Can we take advantage of the Vector API for text analysis?

2023-06-25 Thread via GitHub
rmuir commented on issue #12393: URL: https://github.com/apache/lucene/issues/12393#issuecomment-1606018466 > Intuitively, this kind of workload is amenable to vectorization, could we take advantage of vectorization to speed up text analysis and thus indexing? I don't think it is: eac

[GitHub] [lucene] Deepika0510 opened a new issue, #12395: Reimplementation of Disk Usage API

2023-06-25 Thread via GitHub
Deepika0510 opened a new issue, #12395: URL: https://github.com/apache/lucene/issues/12395 ### Description There is an opportunity to improve the functionality and performance of the existing Disk Usage API through a re-implementation. Currently, the best tool we have for this is base

[GitHub] [lucene] tang-hi opened a new issue, #12396: Make ForUtil Vectorized

2023-06-25 Thread via GitHub
tang-hi opened a new issue, #12396: URL: https://github.com/apache/lucene/issues/12396 ### Description Since the introduction of Vector API into Lucene via #12311, I have found it to be an interesting tool. As a result, I have attempted to use it to rewrite the [ForUtil.java](lucene

[GitHub] [lucene] msokolov commented on issue #12391: Support writes with previous major lucene versions

2023-06-25 Thread via GitHub
msokolov commented on issue #12391: URL: https://github.com/apache/lucene/issues/12391#issuecomment-1606209631 I'm confused as to why the solution described won't work with segment replication. As I understand it what that solution describes would be some writer nodes writing segments with

[GitHub] [lucene] stefanvodita opened a new pull request, #12397: Remove redundant `throws` declarations

2023-06-25 Thread via GitHub
stefanvodita opened a new pull request, #12397: URL: https://github.com/apache/lucene/pull/12397 I've noticed unused `throws` declarations multiple times and thought I'd see if they could be removed. This is the result of running IntelliJ's code inspection. There is an argument to be mad

[GitHub] [lucene] tang-hi commented on issue #12396: Make ForUtil Vectorized

2023-06-25 Thread via GitHub
tang-hi commented on issue #12396: URL: https://github.com/apache/lucene/issues/12396#issuecomment-1606538866 > I think it would be helpful if you could present the results with fewer "significant" digits; as it is it's hard to interpret. And if you are considering changing index format for

[GitHub] [lucene] uschindler commented on issue #12396: Make ForUtil Vectorized

2023-06-25 Thread via GitHub
uschindler commented on issue #12396: URL: https://github.com/apache/lucene/issues/12396#issuecomment-1606799756 Hi, this is a good idea. ForUtil is, next to PackedInts, a good option for SIMD. If you have something implemented, open a PR; we will think of the best way how to include it w