Re: [I] gradle-wrapper.jar will not be updated when its sha/version changes [lucene]

2025-05-07 Thread via GitHub
uschindler commented on issue #14598: URL: https://github.com/apache/lucene/issues/14598#issuecomment-2858764089 Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [I] gradle-wrapper.jar will not be updated when its sha/version changes [lucene]

2025-05-07 Thread via GitHub
uschindler commented on issue #14598: URL: https://github.com/apache/lucene/issues/14598#issuecomment-2858772211 Thanks for also fixing Linux spaces in pathnames. Did you hit it on the s390x Jenkins node? There I added a whitespace in the workspace name!!!

Re: [I] Multi-threaded vector search over multiple segments can lead to inconsistent results [lucene]

2025-05-07 Thread via GitHub
Zona-hu commented on issue #14180: URL: https://github.com/apache/lucene/issues/14180#issuecomment-2861492083 > this should be now fixed via [#14226](https://github.com/apache/lucene/pull/14226) Has the problem been resolved? Will the next version of Elasticsearch fix this issue?

[PR] Avoid unnecessary comparison for CELL_CROSSES_QUERY cases [lucene]

2025-05-07 Thread via GitHub
jainankitk opened a new pull request, #14626: URL: https://github.com/apache/lucene/pull/14626 ### Description This change is probably not required, as the compiler might already be doing this optimization implicitly. It would be great if someone could confirm.

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
dweiss commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858073166 If you have to force-push then you will _lose_ commit history. Force pushing is basically moving the branch's tag to a different commit instead of advancing it over new commits or merge

[I] HyphenationCompoundWordTokenFilter fixed token position and preserves original token [lucene]

2025-05-07 Thread via GitHub
jetzerv opened a new issue, #14624: URL: https://github.com/apache/lucene/issues/14624 ### Description The `HyphenationCompoundWordTokenFilter` is the recommended decompounder for Germanic languages, recommended by Elastic [Elasticsearch Docs](https://www.elastic.co/guide/en/elastics

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
gf2121 commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858342367 Thank you for the suggestions, @dweiss! I'm not sure I exactly get your point, but I think I did not make myself clear. Let me clarify a bit: by 'force-push', I meant that I did

[I] Backward compatibility of codecs in minor releases [lucene]

2025-05-07 Thread via GitHub
ChrisHegarty opened a new issue, #14623: URL: https://github.com/apache/lucene/issues/14623 This issue has been filed to facilitate and capture discussion relating to backward compatibility, specifically around updates to codec formats in minor releases. At Elastic we eagerly adopt a

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
gf2121 commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858481446 My original purpose for this PR was mainly to notify reviewers that these commits will be backported and to allow some additional checks for this big diff. When reviewers think they are ready, I'll c

Re: [I] Backward compatibility of codec formats in minor releases [lucene]

2025-05-07 Thread via GitHub
rmuir commented on issue #14623: URL: https://github.com/apache/lucene/issues/14623#issuecomment-2858643497 I think increasing the back compat burden should be the last resort. The burden can easily hamstring the entire project: allowing a lucene version to write multiple index formats make

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
dweiss commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858653856 We don't understand each other - it's fine to have a PR for a backport (more than fine). And it's fine to cherry pick commits against it. This isn't right though (suspicious): ```

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
dweiss commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2859090286 Exactly! This would matter more if we actually merged PRs back to dev branches... but we squash them anyway so it can be a bit messy in PR history. Thank you for understanding.

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
gf2121 commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2859024991 > I personally also don't think you need to credit anybody that much when backporting... Their credit is already mentioned on the original PR and perhaps in the changes.txt file. T

Re: [I] gradle-wrapper.jar will not be updated when its sha/version changes [lucene]

2025-05-07 Thread via GitHub
dweiss commented on issue #14598: URL: https://github.com/apache/lucene/issues/14598#issuecomment-2858176432 I've noticed that the Windows gradlew.bat had a subtle bug which resulted in always triggering Java checksum validation. I've fixed the problem and will let ChatGPT explain it in detail

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
gf2121 commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858442628 > where (m) is a merge against the target of the PR (main here). Emmm, this is actually a backport PR (targeting branch_10x), and all my effort was to keep the commits exactly the same a

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
dweiss commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858394922 When you force-push to a PR branch, some of that PR's commits are (or may be) gone. This effectively rewrites the history of the PR branch and comments made against those commits.

Re: [I] Backward compatibility of codec formats in minor releases [lucene]

2025-05-07 Thread via GitHub
ChrisHegarty commented on issue #14623: URL: https://github.com/apache/lucene/issues/14623#issuecomment-2859167057 > I think increasing the back compat burden should be the last resort. The burden can easily hamstring the entire project: allowing a lucene version to write multiple index for

Re: [I] gradle-wrapper.jar will not be updated when its sha/version changes [lucene]

2025-05-07 Thread via GitHub
dweiss commented on issue #14598: URL: https://github.com/apache/lucene/issues/14598#issuecomment-2858914015 :) No, I just checked if it works locally on Linux.

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
dweiss commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2858950615 (Note: Please don't take this personally. I just verbalize my opinion on your PR's example but it's nothing against you.) So, here's the problem that I see with force pushes. Let's

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-07 Thread via GitHub
benwtrent commented on PR #14527: URL: https://github.com/apache/lucene/pull/14527#issuecomment-2859301884 > I add the MaxSizedIntArrayList class to solve oversize problem. You need one for scores as well. That `MaxSizedIntArrayList` looks good :)

Re: [I] Backward compatibility of codec formats in minor releases [lucene]

2025-05-07 Thread via GitHub
rmuir commented on issue #14623: URL: https://github.com/apache/lucene/issues/14623#issuecomment-2859433915 Disclaimer: I'm not fully up to speed on the `DocIdSetIterator.intoBitSet` addition that motivated this discussion, but maybe one thought is that it was backported too soon? I'

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-07 Thread via GitHub
weizijun commented on PR #14527: URL: https://github.com/apache/lucene/pull/14527#issuecomment-2861178499 > You need one for scores as well. That `MaxSizedIntArrayList` looks good :) I didn't add the MaxSizedFloatArrayList because the int array will be passed out via the nodes() metho

Re: [I] Test testDeleteUnusedFiles() failed in TestIndexWriter [lucene]

2025-05-07 Thread via GitHub
dweiss commented on issue #11920: URL: https://github.com/apache/lucene/issues/11920#issuecomment-2861906753 > By the way, my new PC's performance is so awesome, ./gradlew check only takes 1m 42s It also shows that something has gone terribly wrong with gradle... Eh.

Re: [PR] Avoid unnecessary comparison for CELL_CROSSES_QUERY cases [lucene]

2025-05-07 Thread via GitHub
dweiss commented on PR #14626: URL: https://github.com/apache/lucene/pull/14626#issuecomment-2861917706 I think I liked the previous version better - this wasn't unreadable at all to my eyes, typical boolean expression?

Re: [PR] Overrides rewrite in PointRangeQuery to optimize AllDocs/NoDocs cases [lucene]

2025-05-07 Thread via GitHub
jainankitk commented on code in PR #14609: URL: https://github.com/apache/lucene/pull/14609#discussion_r2078631302 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -580,4 +557,111 @@ public final String toString(String field) { * @return human rea

[PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-05-07 Thread via GitHub
jainankitk opened a new pull request, #14625: URL: https://github.com/apache/lucene/pull/14625 ### Description When lowerPoint is equal to upperPoint, there is no need to compare against both lowerPoint and upperPoint. The number of comparisons can be reduced by half when co
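
A minimal sketch of the idea described in this PR, not the PR's actual code: when the lower and upper bounds are byte-for-byte identical, a value can be tested with a single equality check instead of two ordered comparisons. The class `RangeMatchSketch` and its methods are hypothetical names, the example assumes a single dimension, and `java.util.Arrays.compareUnsigned` is only a stand-in for whatever comparator the query really uses.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from PointRangeQuery.
final class RangeMatchSketch {
  private final byte[] lower;
  private final byte[] upper;
  // Decided once up front, so the per-value path does not re-check the bounds.
  private final boolean exact;

  RangeMatchSketch(byte[] lower, byte[] upper) {
    this.lower = lower.clone();
    this.upper = upper.clone();
    this.exact = Arrays.equals(lower, upper);
  }

  boolean matches(byte[] value) {
    if (exact) {
      // lower == upper: one equality check replaces two ordered comparisons.
      return Arrays.equals(value, lower);
    }
    // General case: value must satisfy lower <= value <= upper (unsigned byte order).
    return Arrays.compareUnsigned(value, lower) >= 0
        && Arrays.compareUnsigned(value, upper) <= 0;
  }
}
```

In this sketch the bounds are compared once in the constructor, so the saving applies to every value tested afterwards.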

Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-05-07 Thread via GitHub
jainankitk commented on PR #14267: URL: https://github.com/apache/lucene/pull/14267#issuecomment-2860622137 Closing in favor of the linked PR #14625, which addresses the review comments and includes performance benchmark results.

Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-05-07 Thread via GitHub
jainankitk commented on PR #14625: URL: https://github.com/apache/lucene/pull/14625#issuecomment-2860620687 Addressed review comments, and benchmark results below: ``` TaskQPS baseline StdDevQPS my_modified_version StdDevPct d

Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-05-07 Thread via GitHub
jainankitk commented on code in PR #14267: URL: https://github.com/apache/lucene/pull/14267#discussion_r2078596167 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -120,381 +132,475 @@ public void visit(QueryVisitor visitor) { public final Weight c

Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-05-07 Thread via GitHub
jainankitk closed pull request #14267: Reduce the number of comparisons when lowerPoint is equal to upperPoint URL: https://github.com/apache/lucene/pull/14267

Re: [I] Test testDeleteUnusedFiles() failed in TestIndexWriter [lucene]

2025-05-07 Thread via GitHub
uschindler commented on issue #11920: URL: https://github.com/apache/lucene/issues/11920#issuecomment-2860504381 Hi, since Jenkins was updated to Windows 11, every build fails with the same message. My local box is also Windows 11 and fails in the same way. We should fix this. The pro

Re: [I] Test testDeleteUnusedFiles() failed in TestIndexWriter [lucene]

2025-05-07 Thread via GitHub
uschindler commented on issue #11920: URL: https://github.com/apache/lucene/issues/11920#issuecomment-2860543452 I think the problem is that Windows 11 changed semantics and allows deleting open files under some circumstances. The Windows emulation is not enabled on Windows 11 so thi

Re: [I] Test testDeleteUnusedFiles() failed in TestIndexWriter [lucene]

2025-05-07 Thread via GitHub
uschindler commented on issue #11920: URL: https://github.com/apache/lucene/issues/11920#issuecomment-2860592088 On GitHub it does not fail, as the "windows-latest" tag is Windows Server 2022, which is basically the Windows 10 kernel. To reproduce on GitHub we need to change job config to use

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-07 Thread via GitHub
weizijun commented on PR #14527: URL: https://github.com/apache/lucene/pull/14527#issuecomment-2857434676 > The underlying structures utilize `ArrayUtil.grow` to ensure capacity. This means it's very easy to overshoot the maximum size. This is why I was saying we should use a new structure d
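
The overshoot concern quoted above can be illustrated with a small sketch. This is not the `MaxSizedIntArrayList` from the PR; `CappedIntListSketch` is a hypothetical name, and the growth rule (about one eighth plus a small constant) is an assumption chosen only to show the shape of the idea: the backing array still grows geometrically, but its capacity is clamped so it never exceeds the known maximum size.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the class added in the PR.
final class CappedIntListSketch {
  private final int maxSize;
  private int[] values;
  private int size;

  CappedIntListSketch(int initialCapacity, int maxSize) {
    if (initialCapacity < 0 || maxSize < 0) {
      throw new IllegalArgumentException("capacities must be non-negative");
    }
    this.maxSize = maxSize;
    this.values = new int[Math.min(initialCapacity, maxSize)];
  }

  void add(int value) {
    if (size == maxSize) {
      throw new IllegalStateException("list is full: " + maxSize);
    }
    if (size == values.length) {
      // Over-allocate a little to amortize copies, but clamp to maxSize
      // so the array never holds more slots than can ever be used.
      long grown = (long) values.length + (values.length >> 3) + 3;
      values = Arrays.copyOf(values, (int) Math.min(grown, maxSize));
    }
    values[size++] = value;
  }

  int get(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index " + index + ", size " + size);
    }
    return values[index];
  }

  int size() {
    return size;
  }

  int capacity() {
    return values.length;
  }
}
```

With an unclamped growth rule the final resize would allocate slots past the maximum; here the last resize lands exactly on `maxSize`.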

Re: [PR] Adding benchmark for histogram collector over point range query [lucene]

2025-05-07 Thread via GitHub
jainankitk commented on PR #14622: URL: https://github.com/apache/lucene/pull/14622#issuecomment-2857341736 Okay, something weird going on here. Benchmark for matchAll query is as expected: ``` Benchmark (bucketWidth) (docCount) (point

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-07 Thread via GitHub
weizijun commented on code in PR #14527: URL: https://github.com/apache/lucene/pull/14527#discussion_r2076987925 ## lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java: ## @@ -32,13 +33,15 @@ public class NeighborArray { private final boolean scoresDescOrder;

Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]

2025-05-07 Thread via GitHub
gf2121 closed issue #14546: TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 URL: https://github.com/apache/lucene/issues/14546

Re: [PR] Fix tests: too many calls to IndexInput.clone during merging [lucene]

2025-05-07 Thread via GitHub
gf2121 merged PR #14595: URL: https://github.com/apache/lucene/pull/14595

Re: [PR] [Backport] A specialized Trie for Block Tree Index [lucene]

2025-05-07 Thread via GitHub
gf2121 commented on PR #14563: URL: https://github.com/apache/lucene/pull/14563#issuecomment-2857677492 I have to force push to resolve conflicts and keep original commits. I plan to close this and push directly to branch_10x (no-squash) this Friday if no one beats me :)

Re: [PR] Adding benchmark for histogram collector over point range query [lucene]

2025-05-07 Thread via GitHub
jainankitk commented on PR #14622: URL: https://github.com/apache/lucene/pull/14622#issuecomment-2857368351 Okay, I was trying to use `PointRangeQuery` with `NumericDocValuesField` which is not correct. After fixing that, results are as expected: ``` Benchmark