Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14820: URL: https://github.com/apache/lucene/pull/14820#issuecomment-2990157752 If we want to run it on a CI, I think the gh action seems to be the easiest. Alternatively, running docker directly but it requires setting up git safe.directory because eclint uses git t

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
mikemccand commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2990893252 Thanks @dweiss -- I think `precommit` used to do all checks but not run tests? The [nightly benchy currently separately measures test time, from the rest of `check` time](https://ben

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
uschindler commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2990935968 I also liked "precommit" as a task name. So maybe we can keep that for convenience. -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14820: URL: https://github.com/apache/lucene/pull/14820#issuecomment-2990972979 I'm guessing it's not on the approved list at apache though. The action just uses their docker image.

Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]

2025-06-20 Thread via GitHub
rmuir commented on code in PR #14820: URL: https://github.com/apache/lucene/pull/14820#discussion_r2158683763 ## build-tools/build-infra/src/main/groovy/lucene.regenerate.gradle: ## @@ -219,7 +219,8 @@ configure([ // Recompute checksums after the task has completed an

Re: [I] spotlessGradleScripts doesn't work with whitespace-paths on Windows [lucene]

2025-06-20 Thread via GitHub
rmuir commented on issue #14787: URL: https://github.com/apache/lucene/issues/14787#issuecomment-2990990467 Otherwise we make a rust/treesitter-based one with https://github.com/tweag/topiary. The groovy parser isn't the greatest, but it is doable.

Re: [I] spotlessGradleScripts doesn't work with whitespace-paths on Windows [lucene]

2025-06-20 Thread via GitHub
dweiss commented on issue #14787: URL: https://github.com/apache/lucene/issues/14787#issuecomment-2990439462 Nice. Not too fast but works quite well.

Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14820: URL: https://github.com/apache/lucene/pull/14820#issuecomment-2991004830 I'm gonna land this one (it was extremely annoying to get to this stage with the large DFA regeneration etc), and followup with tooling on a separate PR. If some problems sneak in between

Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]

2025-06-20 Thread via GitHub
rmuir merged PR #14820: URL: https://github.com/apache/lucene/pull/14820

Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on code in PR #14820: URL: https://github.com/apache/lucene/pull/14820#discussion_r2158228546 ## build-tools/build-infra/src/main/groovy/lucene.regenerate.gradle: ## @@ -219,7 +219,8 @@ configure([ // Recompute checksums after the task has completed a

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991257003 I switched to the `check -x test` long ago because I found `precommit` confusing: these days this term indicates fast checks that run (typically only on changed files) as part of a `.git/h

[PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]

2025-06-20 Thread via GitHub
jpountz opened a new pull request, #14823: URL: https://github.com/apache/lucene/pull/14823 `TieredMergePolicy` currently allows 10 segments per tier. With Lucene being increasingly deployed with separate indexing and search tiers that get updated via segment-based replication, I believe th

Re: [PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]

2025-06-20 Thread via GitHub
github-actions[bot] commented on PR #14823: URL: https://github.com/apache/lucene/pull/14823#issuecomment-2991280830 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
uschindler commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991110032 Is it possible in Gradle to define a task that does "check -test"? I agree that hard coding precommit as a clone of all check tasks without running tests is not good. But

[PR] style: disable max-line-length in editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir opened a new pull request, #14822: URL: https://github.com/apache/lucene/pull/14822 This is set for all files and causes issues where editors will wrap inconsistently with the formatter. It will happen even with java code: because spotless ignores this restriction for comments, as an

Re: [PR] style: disable max-line-length in editorconfig [lucene]

2025-06-20 Thread via GitHub
github-actions[bot] commented on PR #14822: URL: https://github.com/apache/lucene/pull/14822#issuecomment-2991155854 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
dweiss commented on code in PR #14824: URL: https://github.com/apache/lucene/pull/14824#discussion_r2159163732 ## build-tools/build-infra/src/main/java/org/apache/lucene/gradle/plugins/spotless/ParentGoogleJavaFormatTask.java: ## @@ -0,0 +1,69 @@ +package org.apache.lucene.gradl

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2991939497 ``` > time ./gradlew clean checkGoogleJavaFormat --no-daemon real 0m13.884s user 1m14.370s sys 0m8.233s > time ./gradlew clean spotlessJavaCheck --no-daemon

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
mikemccand commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991458163 Thank you @dweiss for working so hard to improve our build tooling! It is indeed "cold" (tests and check -x tests is nearly the first thing nightly benchy does), but I dunno may

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991738574 I don't know what the cryptic step was needed for. Remove it. If it doesn't work, it needs to be fixed. git clean -xfd is brutal - it'll recompile everything from scratch. I think

Re: [PR] .editorconfig [lucene]

2025-06-20 Thread via GitHub
msokolov commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2991834140 > I would appreciate it if you enable emacs's support for editorconfig and let us know if it was terrible/bad; you seem worried about it for some reason. I would rather do my manu

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991372910 > Is it possible in Gradle to define a task that does "check -test"? This is exactly the problem - no, not really. You could hack around it by programmatically turning off test if '

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
mikemccand commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991624769 > > but I dunno maybe a gradle daemon or 7 is/are still running from last time so it's warm ish > > Good one. I wouldn't run with the daemon at all. By "cold" startup I mean se

Re: [PR] Make spotless gradle formatting an opt-in rather than default. [lucene]

2025-06-20 Thread via GitHub
dweiss merged PR #14821: URL: https://github.com/apache/lucene/pull/14821

Re: [PR] Make spotless gradle formatting an opt-in rather than default. [lucene]

2025-06-20 Thread via GitHub
github-actions[bot] commented on PR #14821: URL: https://github.com/apache/lucene/pull/14821#issuecomment-2990536605 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

[PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
dweiss opened a new pull request, #14824: URL: https://github.com/apache/lucene/pull/14824 Spotless's support for google java format is kind of slow - I don't think it implements gradle's workers api so concurrency is limited to different projects. This issue aims to provide two cus

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991645292 > OK, no daemon! I think it still makes a daemon anyway, just complains, and then kills it at the end of the run :)

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992549424 Sorry for the noise, just my lack of gradle knowledge there.

Re: [PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]

2025-06-20 Thread via GitHub
mikemccand commented on PR #14823: URL: https://github.com/apache/lucene/pull/14823#issuecomment-2991356685 +1, this is a great idea -- more aggressive merging by default makes sense.

Re: [PR] style: disable max-line-length in editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14822: URL: https://github.com/apache/lucene/pull/14822#issuecomment-2991194959 Full list of [violations.txt](https://github.com/user-attachments/files/20835991/violations.txt)

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992145681 You could also consider cheating for the local developer. For a formatting task, it is safe to only format changed files (eg based on git status or gradle cache or whatever). For formattin
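
The "only format changed files, based on git status" idea above can be sketched roughly as follows. This is an illustration only, not code from the PR: the class `ChangedJavaFiles` and its `select` method are hypothetical names, and the sketch assumes the caller has already captured the lines of `git status --porcelain` (v1 format: two status characters, a space, then the path, with renames written as `old -> new`). Paths that git quotes because of special characters are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: picks changed Java files out of `git status --porcelain` output. */
public final class ChangedJavaFiles {
  private ChangedJavaFiles() {}

  /**
   * Each porcelain (v1) line is "XY path": two status characters, a space, then the path.
   * Renames are reported as "R  old -> new"; only the new path is kept.
   */
  public static List<String> select(List<String> porcelainLines) {
    List<String> result = new ArrayList<>();
    for (String line : porcelainLines) {
      if (line.length() < 4) {
        continue; // too short to hold a status and a path
      }
      String status = line.substring(0, 2);
      if (status.indexOf('D') >= 0) {
        continue; // deleted files cannot be formatted
      }
      String path = line.substring(3);
      int arrow = path.indexOf(" -> ");
      if (arrow >= 0) {
        path = path.substring(arrow + 4); // keep the rename target
      }
      if (path.endsWith(".java")) {
        result.add(path);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> sample = List.of(
        " M lucene/core/src/java/Foo.java",
        "?? lucene/core/src/java/Bar.java",
        " D lucene/core/src/java/Gone.java",
        "R  old/Name.java -> new/Name.java",
        " M build.gradle");
    System.out.println(select(sample));
  }
}
```

A formatter task could then be handed just this list instead of the whole source tree; a CI check would presumably still need to scan everything.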

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991575632 > but I dunno maybe a gradle daemon or 7 is/are still running from last time so it's warm ish Good one. I wouldn't run with the daemon at all. By "cold" startup I mean setting thin

Re: [I] fix sources to conform to .editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir commented on issue #14819: URL: https://github.com/apache/lucene/issues/14819#issuecomment-2991947545 Next step up is the CI linter task. From the gradle side I will look to do it consistently with ast-grep. Because elements in .editorconfig configure the editor, I think it

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992515815 This is already part of this patch, Robert - it is incremental so it should only apply to files that have changed since the last run. I don't think it's smart enough to do ratchet-style u

Re: [PR] .editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2991984654 Personally I think there's just too much going on here. The file was rushed in too quickly and is far too aggressive (e.g. applying too many aggressive settings to ALL files). It

Re: [I] fix sources to conform to .editorconfig [lucene]

2025-06-20 Thread via GitHub
dweiss commented on issue #14819: URL: https://github.com/apache/lucene/issues/14819#issuecomment-2990073330 > It causes the issue that when editing any impacted file, random unrelated formatting changes will appear. This is what I was worried about.

Re: [PR] docs: fix invalid html [lucene]

2025-06-20 Thread via GitHub
github-actions[bot] commented on PR #14818: URL: https://github.com/apache/lucene/pull/14818#issuecomment-2991037224 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991087808 Precommit was manually configured to include some checks. I don't know, haven't touched it for years... For consistency, I'd move on from ant days and just use check or check -x test if y

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-20 Thread via GitHub
mikemccand merged PR #14178: URL: https://github.com/apache/lucene/pull/14178

Re: [PR] Use PriorityQueue instead of TreeMap in FirstPassGroupingCollector. [lucene]

2025-06-20 Thread via GitHub
vsop-479 commented on PR #14813: URL: https://github.com/apache/lucene/pull/14813#issuecomment-2990790159 I added a new benchmark (`FirstPassGroupingBenchmark`) to simulate `FirstPassGroupingCollector`'s implementation. It seems that using a heap gives better performance: ``` Benchmark

[PR] Make spotless gradle formatting an opt-in rather than default. [lucene]

2025-06-20 Thread via GitHub
dweiss opened a new pull request, #14821: URL: https://github.com/apache/lucene/pull/14821 This is a tweak of groovy/gradle formatting check for spotless so that it does _not_ run by default. This avoids the heavy greclipse download and makes tidy run much faster. Most devs won't even touch

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-20 Thread via GitHub
mikemccand commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2990864157 Thank you @kaivalnp! I just merged this ... I'll try to watch builds. Let's let this bake for a week or so on `main` branch and then backport for 10.x?

Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991385071 > Thanks @dweiss -- I think `precommit` used to do all checks but not run tests? The [nightly benchy currently separately measures test time, from the rest of `check` time](https://benchm

Re: [PR] .editorconfig [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2991954656 > I would rather do my manual formatting while I am working and let tidy take care of the rest. I'm from the same camp. This is one of the reasons I like gjf - I can move things ar

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
dweiss commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992036802 Right - it's gradle.properties. I'll be experimenting and will try to push this forward - it's a promising direction to cut some time from one of the most expensive tasks at the moment.

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-20 Thread via GitHub
kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2992093576 Thank you @mikemccand! > I'll try to watch builds Here's a link for the GH action that tests this codec on a new commit / PR: https://github.com/apache/lucene/actions/workf

Re: [PR] style: disable max-line-length in editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir merged PR #14822: URL: https://github.com/apache/lucene/pull/14822

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992023661 > I had to remove `-XX:ActiveProcessorCount=1` from gradle.properties for the above. I think this setting for some reason makes gradle sluggish and unpredictable (the timings vary a lot).
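
For context on the flag quoted above: `-XX:ActiveProcessorCount=N` makes the HotSpot JVM report `N` from `Runtime.getRuntime().availableProcessors()`, so any code that sizes a thread pool from that value sees a single CPU when `N` is 1. The sketch below is illustrative only; `ProcessorCountDemo` and `workerCount` are hypothetical names, and nothing here is taken from Gradle or from the Lucene build.

```java
/** Hypothetical demo of how -XX:ActiveProcessorCount changes what the JVM reports. */
public final class ProcessorCountDemo {
  private ProcessorCountDemo() {}

  /** A common pool-sizing rule: one worker per reported processor, capped, at least one. */
  public static int workerCount(int reportedProcessors, int cap) {
    return Math.max(1, Math.min(reportedProcessors, cap));
  }

  public static void main(String[] args) {
    // With -XX:ActiveProcessorCount=1 this reports 1, whatever the hardware has.
    int reported = Runtime.getRuntime().availableProcessors();
    System.out.println("reported processors: " + reported);
    System.out.println("workers (cap 8): " + workerCount(reported, 8));
  }
}
```

Running it once as `java ProcessorCountDemo` and once as `java -XX:ActiveProcessorCount=1 ProcessorCountDemo` shows the difference in the reported count.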

Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]

2025-06-20 Thread via GitHub
dweiss commented on code in PR #14824: URL: https://github.com/apache/lucene/pull/14824#discussion_r2159174610 ## build-tools/build-infra/src/main/java/org/apache/lucene/gradle/plugins/spotless/ApplyGoogleJavaFormatTask.java: ## @@ -0,0 +1,77 @@ +package org.apache.lucene.gradle

Re: [PR] .editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2993007936 If you don't mind, i'd like to get the CI piece working before making any big changes to this file. Then we can tone it down, fix it, etc, but changes to it will be validated rather than e

Re: [PR] Add Query for reranking KnnFloatVectorQuery with full-precision vectors [lucene]

2025-06-20 Thread via GitHub
vigyasharma commented on PR #14009: URL: https://github.com/apache/lucene/pull/14009#issuecomment-2992621644 Thanks for the explanations, @dungba88. I suppose the scenario you're trying to solve for, is when users want to change the matchset of a KnnVectorQuery using full-precision or other

Re: [PR] Add the ability to inverse a Sort [lucene]

2025-06-20 Thread via GitHub
msokolov commented on PR #14775: URL: https://github.com/apache/lucene/pull/14775#issuecomment-2992625771 Should we refuse to allow inverse of Sort.SCORE? or RELEVANCE?

Re: [PR] build: add eclint to verify editorconfig in CI [lucene]

2025-06-20 Thread via GitHub
rmuir commented on code in PR #14825: URL: https://github.com/apache/lucene/pull/14825#discussion_r2159848746 ## .github/actions/eclint/action.yml: ## @@ -0,0 +1,33 @@ +name: Install eclint +description: Installs eclint from cache, or builds it + +inputs: + eclint-version: +

Re: [PR] build: fix incorrect gradle dependency [lucene]

2025-06-20 Thread via GitHub
dweiss merged PR #14826: URL: https://github.com/apache/lucene/pull/14826

[PR] build: add eclint to verify editorconfig in CI [lucene]

2025-06-20 Thread via GitHub
rmuir opened a new pull request, #14825: URL: https://github.com/apache/lucene/pull/14825 You generally shouldn't need the check, as your editor should follow the editorconfig and not introduce the errors checked-for here. But it helps maintain the .editorconfig itself and ensure that

Re: [PR] .editorconfig [lucene]

2025-06-20 Thread via GitHub
dsmiley commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2992922097 I definitely disagree on this PR being rushed. You aren't a fan of it but that doesn't make it rushed. > Trying to apply such settings to ALL files of ALL types in repo is too mu

Re: [PR] .editorconfig [lucene]

2025-06-20 Thread via GitHub
rmuir commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2992994166 I'm actually a huge fan of having editorconfig, I just know it can be tricky if you want to get it right (e.g. not have random style changes in PRs). Similar process to when we move

Re: [PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]

2025-06-20 Thread via GitHub
jpountz commented on PR #14823: URL: https://github.com/apache/lucene/pull/14823#issuecomment-2993010598 Thanks @mikemccand ! I'll wait a few days before merging to give others a chance to take a look.

Re: [PR] build: add eclint to verify editorconfig in CI [lucene]

2025-06-20 Thread via GitHub
github-actions[bot] commented on PR #14825: URL: https://github.com/apache/lucene/pull/14825#issuecomment-2993289814 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] build: add eclint to verify editorconfig in CI [lucene]

2025-06-20 Thread via GitHub
rmuir commented on code in PR #14825: URL: https://github.com/apache/lucene/pull/14825#discussion_r2159850350 ## .github/actions/eclint/action.yml: ## @@ -0,0 +1,33 @@ +name: Install eclint +description: Installs eclint from cache, or builds it + +inputs: + eclint-version: +

Re: [PR] build: fix incorrect gradle dependency [lucene]

2025-06-20 Thread via GitHub
github-actions[bot] commented on PR #14826: URL: https://github.com/apache/lucene/pull/14826#issuecomment-2993299954 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

[PR] build: fix incorrect gradle dependency [lucene]

2025-06-20 Thread via GitHub
rmuir opened a new pull request, #14826: URL: https://github.com/apache/lucene/pull/14826 applyAstGrep doesn't run in CI, because it is set to depend on "tidy" instead of "check"

Re: [PR] Add Query for reranking KnnFloatVectorQuery with full-precision vectors [lucene]

2025-06-20 Thread via GitHub
dungba88 commented on PR #14009: URL: https://github.com/apache/lucene/pull/14009#issuecomment-2993030559 > is when users want to change the matchset of a KnnVectorQuery using full-precision or other reranking Yes that's correct, @vigyasharma. We are using a hybrid search where KnnFl