Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]
dweiss commented on PR #14820: URL: https://github.com/apache/lucene/pull/14820#issuecomment-2990157752 If we want to run it on a CI, I think the gh action seems to be the easiest. Alternatively, running docker directly but it requires setting up git safe.directory because eclint uses git to list the files to validate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
mikemccand commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2990893252 Thanks @dweiss -- I think `precommit` used to do all checks but not run tests? The [nightly benchy currently separately measures test time, from the rest of `check` time](https://benchmarks.mikemccandless.com/antcleantest.html) (whoa, `precommit` time is ~6 minutes, and test time is ~1.2 minutes). Oh it looks like I can maybe do `./gradlew check -x test`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
uschindler commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2990935968 I also liked "precommit" as a task name. So maybe we can keep that for convenience. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]
rmuir commented on PR #14820: URL: https://github.com/apache/lucene/pull/14820#issuecomment-2990972979 I'm guessing its not on the approved list at apache though. The action just uses their docker image. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]
rmuir commented on code in PR #14820: URL: https://github.com/apache/lucene/pull/14820#discussion_r2158683763 ## build-tools/build-infra/src/main/groovy/lucene.regenerate.gradle: ## @@ -219,7 +219,8 @@ configure([ // Recompute checksums after the task has completed and write them. def updatedChecksums = computeChecksummedEntries(sourceTask) checksumsFile.setText( - JsonOutput.prettyPrint(JsonOutput.toJson(new TreeMap(updatedChecksums))), "UTF-8") + JsonOutput.prettyPrint(JsonOutput.toJson(new TreeMap(updatedChecksums))) + '\n', "UTF-8") Review Comment: This is why I think it is bad to have inconsistent max_line_length set in the editorconfig. Any time I edit gradle you are going to see a change like this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] spotlessGradleScripts doesn't work with whitespace-paths on Windows [lucene]
rmuir commented on issue #14787: URL: https://github.com/apache/lucene/issues/14787#issuecomment-2990990467 Otherwise we make rust/treesitter-based one with https://github.com/tweag/topiary The groovy parser isn't the greatest, but it is doable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] spotlessGradleScripts doesn't work with whitespace-paths on Windows [lucene]
dweiss commented on issue #14787: URL: https://github.com/apache/lucene/issues/14787#issuecomment-2990439462 Nice. Not too fast but works quite well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]
rmuir commented on PR #14820: URL: https://github.com/apache/lucene/pull/14820#issuecomment-2991004830 I'm gonna land this one (it was extremely annoying to get to this stage with the large DFA regeneration etc), and followup with tooling on a separate PR. If some problems sneak in between those times, I'll fix them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]
rmuir merged PR #14820: URL: https://github.com/apache/lucene/pull/14820 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: fix sources to conform to .editorconfig (basic fixes only) [lucene]
dweiss commented on code in PR #14820: URL: https://github.com/apache/lucene/pull/14820#discussion_r2158228546 ## build-tools/build-infra/src/main/groovy/lucene.regenerate.gradle: ## @@ -219,7 +219,8 @@ configure([ // Recompute checksums after the task has completed and write them. def updatedChecksums = computeChecksummedEntries(sourceTask) checksumsFile.setText( - JsonOutput.prettyPrint(JsonOutput.toJson(new TreeMap(updatedChecksums))), "UTF-8") + JsonOutput.prettyPrint(JsonOutput.toJson(new TreeMap(updatedChecksums))) + '\n', "UTF-8") Review Comment: This is mildly surprising. Passes with greclipse and both versions are "ok" for it but I wonder if conflicts between the rules for different systems (editorconfig, spotless) won't cause headaches. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
rmuir commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991257003 I switched to the `check -x test` long ago because I found `precommit` confusing: these days this term indicates fast checks that run (typically only on changed files) as part of a `.git/hooks/pre-commit` hook. They aren't supposed to take minutes. I don't use them personally but manage them for others. https://pre-commit.com/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]
jpountz opened a new pull request, #14823: URL: https://github.com/apache/lucene/pull/14823 `TieredMergePolicy` currently allows 10 segments per tier. With Lucene being increasingly deployed with separate indexing and search tiers that get updated via segment-based replication, I believe that it would make sense for Lucene to have more aggressive merging defaults, a price that is only paid once on the indexing tier, but that benefits all search nodes that serve queries for this index. Note that this is still a somewhat conservative default, applications with low latency requirements and low update rates will likely want to go even further, with 4 segments per tier, or even 2. `BaseMergePolicyTestCase#testSimulateAppendOnly` reports a write amplification increase from 3.4 to 3.8, while `BaseMergePolicyTestCase#testSimulateUpdates` reports a write amplification increase from 4.5 to 4.9. In exchange, the number of segments between the floor and max segment sizes decreases by about 20%. This should especially help queries that have a high per-segment overhead: PK lookups, point queries, multi-term queries and vector searches. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]
github-actions[bot] commented on PR #14823: URL: https://github.com/apache/lucene/pull/14823#issuecomment-2991280830 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
uschindler commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991110032 Is it possible in Gradle to define a task that does "check -test"? I agree that hard coding precommit as clone of as ll check tasks without running tests is not good. But the precommit was always an easy fix. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] style: disable max-line-length in editorconfig [lucene]
rmuir opened a new pull request, #14822: URL: https://github.com/apache/lucene/pull/14822 This is set for all files and causes issues where editors will wrap inconsistently with the formatter. It will happen even with java code: because spotless ignores this restriction for comments, as an example. Best to leave such settings to tools like spotless which are AST-aware. Relates: #14819 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: disable max-line-length in editorconfig [lucene]
github-actions[bot] commented on PR #14822: URL: https://github.com/apache/lucene/pull/14822#issuecomment-2991155854 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
dweiss commented on code in PR #14824: URL: https://github.com/apache/lucene/pull/14824#discussion_r2159163732 ## build-tools/build-infra/src/main/java/org/apache/lucene/gradle/plugins/spotless/ParentGoogleJavaFormatTask.java: ## @@ -0,0 +1,69 @@ +package org.apache.lucene.gradle.plugins.spotless; + +import com.google.googlejavaformat.java.Formatter; +import com.google.googlejavaformat.java.JavaFormatterOptions; +import org.gradle.api.DefaultTask; +import org.gradle.api.file.ConfigurableFileCollection; +import org.gradle.api.file.FileType; +import org.gradle.api.file.ProjectLayout; +import org.gradle.api.file.RegularFileProperty; +import org.gradle.api.tasks.InputFiles; +import org.gradle.api.tasks.Internal; +import org.gradle.api.tasks.OutputFile; +import org.gradle.api.tasks.PathSensitive; +import org.gradle.api.tasks.PathSensitivity; +import org.gradle.work.ChangeType; +import org.gradle.work.FileChange; +import org.gradle.work.Incremental; +import org.gradle.work.InputChanges; +import org.gradle.workers.WorkQueue; +import org.gradle.workers.WorkerExecutor; + +import javax.inject.Inject; +import java.io.File; +import java.util.List; +import java.util.stream.StreamSupport; + +abstract class ParentGoogleJavaFormatTask extends DefaultTask { +@Incremental +@InputFiles +@PathSensitive(PathSensitivity.RELATIVE) +public abstract ConfigurableFileCollection getSourceFiles(); + +@OutputFile +public abstract RegularFileProperty getOutputChangeListFile(); + +@Inject +protected abstract WorkerExecutor getWorkerExecutor(); + +public ParentGoogleJavaFormatTask(ProjectLayout layout, String gjfTask) { + getOutputChangeListFile().convention(layout.getBuildDirectory().file("gjf-" + gjfTask + ".txt")); +} + +protected static Formatter getFormatter() { +JavaFormatterOptions options = +JavaFormatterOptions.builder() +.style(JavaFormatterOptions.Style.GOOGLE) +.formatJavadoc(true) +.reorderModifiers(true) +.build(); +return new Formatter(options); +} + +protected List getIncrementalBatch(InputChanges inputChanges) { +return StreamSupport.stream(inputChanges.getFileChanges(getSourceFiles()).spliterator(), false) +.filter(fileChange -> { +return fileChange.getFileType() == FileType.FILE && + (fileChange.getChangeType() == ChangeType.ADDED || +fileChange.getChangeType() == ChangeType.MODIFIED); +}) +.map(FileChange::getFile) +.toList(); +} + +@Internal +protected WorkQueue getWorkQueue() { +// TODO: maybe fork a separate jvm so that we can pass open-module settings there and fine-tune the jvm for the task? +return getWorkerExecutor().noIsolation(); +} Review Comment: I'd love to fork a separate jvm here - where we can set some options that open up modules, etc. The problem is - there is no way to control how many isolated processes this forks. On high-end multi-core machines I see dozens of forked java processes - this doesn't make sense and is super heavy. Is there a way to control the concurrency of the isolated worker executor, @breskeby ? ## build-tools/build-infra/src/main/java/org/apache/lucene/gradle/plugins/spotless/CheckGoogleJavaFormatTask.java: ## @@ -0,0 +1,195 @@ +package org.apache.lucene.gradle.plugins.spotless; + +import com.github.difflib.text.DiffRow; +import com.github.difflib.text.DiffRowGenerator; +import com.google.googlejavaformat.java.FormatterException; +import com.google.googlejavaformat.java.ImportOrderer; +import com.google.googlejavaformat.java.JavaFormatterOptions; +import com.google.googlejavaformat.java.RemoveUnusedImports; +import java.io.File; +import java.io.IOException; +import java.io.UncheckedIOException; +import java.io.Writer; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; +import java.util.stream.Stream; +import javax.inject.Inject; +import org.gradle.api.GradleException; +import org.gradle.api.file.ConfigurableFileCollection; +import org.gradle.api.file.FileSystemOperations; +import org.gradle.api.file.ProjectLayout; +import org.gradle.api.file.RegularFileProperty; +import org.gradle.api.tasks.TaskAction; +import org.gradle.work.DisableCachingByDefault; +import org.gradle.work.InputChanges; +import org.gradle.workers.WorkAction; +import org.gradle.workers.WorkParameters; +import org.gradle.workers.WorkQueue; + +@DisableCachingByDefault +public abstract class CheckGoogleJavaFormatTask extends ParentGoogleJavaFormatTask { + + @Inject + protected abstract FileSystemOperations getFilesystemOperations(); + + @Inject + public CheckGoogleJavaF
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
dweiss commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2991939497 ``` > time ./gradlew clean checkGoogleJavaFormat --no-daemon real0m13.884s user1m14.370s sys 0m8.233s > time ./gradlew clean spotlessJavaCheck --no-daemon real0m29.668s user1m27.121s sys 0m8.733s ``` I had to remove ```-XX:ActiveProcessorCount=1``` from gradle.properties for the above. I think this setting for some reason makes gradle sluggish and unpredictable (the timings vary a lot). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
mikemccand commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991458163 Thank you @dweiss for working so hard to improve our build tooling! It is indeed "cold" (tests and check -x tests is nearly the first thing nightly benchy does), but I dunno maybe a gradle daemon or 7 is/are still running from last time so it's warm ish? I think I got nightly benchy started again (one off) just now after switching to `./gradlew check -x test` -- let's see how the timings look in ~14 hours or so. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991738574 I don't know what the cryptic step was needed for. Remove it. If it doesn't work, it needs to be fixed. git clean -xfd is brutal - it'll recompile everything from scratch. I think it has benefits for the benchmark though, so I'd leave it in. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] .editorconfig [lucene]
msokolov commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2991834140 > I would appreciate it if you enable emacs's support for editorconfig and let us know if it was terrible/bad; you seem worried about it for some reason. I would rather do my manual formatting while I am working and let tidy take care of the rest. Or is this covering something tidy does not? If it is, could tidy be made to handle? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991372910 > Is it possible in Gradle to define a task that does "check -test"? This is exactly the problem - no, not really. You could hack around it by programmatically turning off test if 'precommit' is on start args... but it is really an ugly workaround for something that isn't a problem to me. You should run your tests before you commit so 'gradlew check' is shorter, really. If you insist on running without tests, exclude the ones you don't want. Also, this is per-project so ``` ./gradlew -p lucene/core check -x test ``` will run all checks only for the lucene core - precommit was global (involved every project). > But the precommit was always an easy shortcut. len("precommit") > len("check") ? :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
mikemccand commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991624769 > > but I dunno maybe a gradle daemon or 7 is/are still running from last time so it's warm ish > > Good one. I wouldn't run with the daemon at all. By "cold" startup I mean setting things up - downloading deps, compiling the buid-infra scripts, etc. These are currently slow like hell. Oh hmm ... no, actually. It reuses the same checkout in the local (NVMe SSD) filesystem, does a `git pull`, a `git clean -xfd`, hmm then this cryptic step (maybe not needed anymore!!): ``` # There is some weird gradle bootstrapping bug: if we do not run this "help" first, then the test run fails w/ cryptic error: os.system("./gradlew help >> %s.tmp 2>&1" % logFile) ``` Then runs tests with `./gradlew --no-daemon -Ptask.times=true -p lucene test >> %s.tmp 2>&1` -- OK, no daemon! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Make spotless gradle formatting an opt-in rather than default. [lucene]
dweiss merged PR #14821: URL: https://github.com/apache/lucene/pull/14821 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Make spotless gradle formatting an opt-in rather than default. [lucene]
github-actions[bot] commented on PR #14821: URL: https://github.com/apache/lucene/pull/14821#issuecomment-2990536605 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
dweiss opened a new pull request, #14824: URL: https://github.com/apache/lucene/pull/14824 Spotless's support for google java format is kind of slow - I don't think it implements gradle's workers api so concurrency is limited to different projects. This issue aims to provide two custom tasks for applying and checking google java format, then use them as a substitute for those found in spotless. This is in draft state and not fully working but it seems faster for me already (although there are open questions I've been struggling with). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
rmuir commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991645292 > OK, no daemon! I think it still makes a daemon anyway, just complains, and then kills it at the end of the run :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
rmuir commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992549424 Sorry for the noise, just my lack of gradle knowledge there. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]
mikemccand commented on PR #14823: URL: https://github.com/apache/lucene/pull/14823#issuecomment-2991356685 +1, this is a great idea -- more aggessive merging by default makes sense. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: disable max-line-length in editorconfig [lucene]
rmuir commented on PR #14822: URL: https://github.com/apache/lucene/pull/14822#issuecomment-2991194959 Full list of [violations.txt](https://github.com/user-attachments/files/20835991/violations.txt) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
rmuir commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992145681 You could also consider cheating for the local developer. For a formatting task, it is safe to only format changed files (eg based on git status or gradle cache or whatever). For formatting there aren't dependencies between files. As long as user can do a `CI=true ./gradlew tidy` to force it happening the slow way, corner cases such as formatter upgrades can all be handled. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991575632 > but I dunno maybe a gradle daemon or 7 is/are still running from last time so it's warm ish Good one. I wouldn't run with the daemon at all. By "cold" startup I mean setting things up - downloading deps, compiling the buid-infra scripts, etc. These are currently slow like hell. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] fix sources to conform to .editorconfig [lucene]
rmuir commented on issue #14819: URL: https://github.com/apache/lucene/issues/14819#issuecomment-2991947545 next step up, is the CI linter task. From the gradle side I will look to do it consistent with the ast-grep. because elements in .editorconfig, configure the editor, I think its ok to just invoke it in CI (or if the user opts-in via a property). Newly introduced problems going forwards should be rare. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
dweiss commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992515815 This is already part of this patch, Robert - it is incremental so it should only apply to files that have changed since the last run. I don't think it's smart enough to do ratchet-style updates based on git history or anything like that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] .editorconfig [lucene]
rmuir commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2991984654 Personally I think there's just too much going on here. The file was rushed in too quickly and is far too aggressive (e.g. applying too many aggressive settings to ALL files). It is better to proceed with caution around this file. An example would be the python file deleted here: https://github.com/apache/lucene/blob/f83587416ecaa7b5fb201dd3938fd06e2af95aba/dev-tools/scripts/.editorconfig Because there was inconsistent formatting of the python files in that folder, it was introduced JUST for very specific filetype in JUST that specific folder. Trying to apply such settings to ALL files of ALL types in repo is too much, you need to deal with wacky file types (e.g. make with tabs), windows file types that should have CRLF, generated files, test files, files that aren't valid utf-8, and the list goes on and on. I'm trying to cope regardless, but it isn't easy and I still think there's too much going on here. See #14819 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] fix sources to conform to .editorconfig [lucene]
dweiss commented on issue #14819: URL: https://github.com/apache/lucene/issues/14819#issuecomment-2990073330 > It causes the issue that when editing any impacted file, random unrelated formatting changes will appear. This is what I was worried about. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] docs: fix invalid html [lucene]
github-actions[bot] commented on PR #14818: URL: https://github.com/apache/lucene/pull/14818#issuecomment-2991037224 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991087808 Precommit was manually configured to include some checks. I don't know, haven't touched it for years... For consistency, I'd move on from ant days and just use check or check -x test if you want to run without tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add a Faiss codec for KNN searches [lucene]
mikemccand merged PR #14178: URL: https://github.com/apache/lucene/pull/14178 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Use PriorityQueue instead of TreeMap in FirstPassGroupingCollector. [lucene]
vsop-479 commented on PR #14813: URL: https://github.com/apache/lucene/pull/14813#issuecomment-2990790159 I added a new benchmark (`FirstPassGroupingBenchmark`) to simulate `FirstPassGroupingCollector` 's implementation. It seems use heap can gain a better performance: ``` Benchmark Mode Cnt Score Error Units FirstPassGroupingBenchmark.getTopNWithHeap thrpt 15 0.010 ± 0.002 ops/us FirstPassGroupingBenchmark.getTopNWithTreeMap thrpt 15 0.005 ± 0.001 ops/us ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Make spotless gradle formatting an opt-in rather than default. [lucene]
dweiss opened a new pull request, #14821: URL: https://github.com/apache/lucene/pull/14821 This is a tweak of groovy/gradle formatting check for spotless so that it does _not_ run by default. This avoids the heavy greclipse download and makes tidy run much faster. Most devs won't even touch build files - I don't think we all need to suffer here. I'll just add a gh workflow to check the formatting periodically and I'll apply the formatting if it fails. Anybody willing to suffer (heh) can turn it on locally by setting the corresponding build option ```lucene.spotlessGradleScripts``` to true in gradle.properties or their local build option file. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add a Faiss codec for KNN searches [lucene]
mikemccand commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2990864157 Thank you @kaivalnp! I just merged this ... I'll try to watch builds. Let's let this bake for a week or so on `main` branch and then backport for 10.x? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]
dweiss commented on PR #14764: URL: https://github.com/apache/lucene/pull/14764#issuecomment-2991385071 > Thanks @dweiss -- I think `precommit` used to do all checks but not run tests? The [nightly benchy currently separately measures test time, from the rest of `check` time](https://benchmarks.mikemccandless.com/antcleantest.html) (whoa, `precommit` time is ~6 minutes, and test time is ~1.2 minutes). Yes, this is an abomination. This benchmark is from a cold start, I think? I've been working to speed things up. We'll see how it goes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] .editorconfig [lucene]
dweiss commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2991954656 > I would rather do my manual formatting while I am working and let tidy take care of the rest. I'm from the same camp. This is one of the reasons I like gjf - I can move things around and they half-magically revert back to the previous state. No need to think about formatting that much. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
dweiss commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992036802 Right - it's gradle.properties. I'll be experimenting and will try to push this forward - it's a promising direction to cut some time from one of the most expensive tasks at the moment.  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add a Faiss codec for KNN searches [lucene]
kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2992093576 Thank you @mikemccand! > I'll try to watch builds Here's a link for the GH action that tests this codec on a new commit / PR: https://github.com/apache/lucene/actions/workflows/run-special-checks-sandbox.yml > backport for 10.x? Exciting! I can try to create a backport PR.. [10.x](https://github.com/apache/lucene/tree/branch_10x) seems to be on JDK21 where the FFM API was in preview (https://openjdk.org/jeps/442) and was subsequently finalized in JDK22 (https://openjdk.org/jeps/454), I hope we don't need too many changes :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] style: disable max-line-length in editorconfig [lucene]
rmuir merged PR #14822: URL: https://github.com/apache/lucene/pull/14822 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
rmuir commented on PR #14824: URL: https://github.com/apache/lucene/pull/14824#issuecomment-2992023661 > I had to remove `-XX:ActiveProcessorCount=1` from gradle.properties for the above. I think this setting for some reason makes gradle sluggish and unpredictable (the timings vary a lot). For just the `org.gradle.jvmargs`, correct? I think it makes sense to nuke it here. It makes better sense for forked processes (e.g. `tests.jvmargs`), where, on an 8 processor machine, you want to avoid fork()'ing 8 JVMS, each of which is sized for 8 CPU machine (8 * 8 = 64). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial prototype of custom google java format tasks to replace spotless [lucene]
dweiss commented on code in PR #14824: URL: https://github.com/apache/lucene/pull/14824#discussion_r2159174610 ## build-tools/build-infra/src/main/java/org/apache/lucene/gradle/plugins/spotless/ApplyGoogleJavaFormatTask.java: ## @@ -0,0 +1,77 @@ +package org.apache.lucene.gradle.plugins.spotless; + +import com.google.googlejavaformat.java.FormatterException; +import org.gradle.api.file.ConfigurableFileCollection; +import org.gradle.api.file.ProjectLayout; +import org.gradle.api.tasks.TaskAction; +import org.gradle.work.DisableCachingByDefault; +import org.gradle.work.InputChanges; +import org.gradle.workers.WorkAction; +import org.gradle.workers.WorkParameters; +import org.gradle.workers.WorkQueue; + +import javax.inject.Inject; +import java.io.File; +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.List; +import java.util.stream.Collectors; + +@DisableCachingByDefault +public abstract class ApplyGoogleJavaFormatTask extends ParentGoogleJavaFormatTask { +@Inject +public ApplyGoogleJavaFormatTask(ProjectLayout layout) { +super(layout, "apply"); +} + +@TaskAction +public void formatSources(InputChanges inputChanges) throws IOException { +WorkQueue workQueue = getWorkQueue(); +List sourceFiles = getIncrementalBatch(inputChanges); +if (sourceFiles.isEmpty()) { +return; +} + +getLogger().info("Will apply formatting to {} source {} in this run.", +sourceFiles.size(), sourceFiles.size() == 1 ? "file" : "files"); + +for (var sourceFile : sourceFiles) { +workQueue.submit(ApplyGoogleJavaFormatAction.class, params -> { +params.getTargetFiles().setFrom(List.of(sourceFile)); +}); +} + +Path taskOutput = getOutputChangeListFile().get().getAsFile().toPath(); +Files.writeString( +taskOutput, + sourceFiles.stream().map(File::getPath).collect(Collectors.joining("\n"))); + +workQueue.await(); +} + +public static abstract class ApplyGoogleJavaFormatAction implements WorkAction { +public interface Parameters extends WorkParameters { +ConfigurableFileCollection getTargetFiles(); +} + +@Override +public void execute() { +var formatter = getFormatter(); + +for (File inputFile : getParameters().getTargetFiles().getFiles()) { +var inputPath = inputFile.toPath(); +try { +String input = Files.readString(inputPath); +String output = formatter.formatSourceAndFixImports(input); + +if (!input.equals(output)) { +Files.writeString(inputPath, output); +} Review Comment: @breskeby this is one example where I think gradle build model doesn't shine - for this task, the outputs and inputs overlap. It takes input files, formats them and writes them back to the same location. I know Ned Twigg (spotless) had similar problems. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] .editorconfig [lucene]
rmuir commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2993007936 If you don't mind, i'd like to get the CI piece working before making any big changes to this file. Then we can tone it down, fix it, etc, but changes to it will be validated rather than eyeballed. Just gimme a bit to get a PR up -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add Query for reranking KnnFloatVectorQuery with full-precision vectors [lucene]
vigyasharma commented on PR #14009: URL: https://github.com/apache/lucene/pull/14009#issuecomment-2992621644 Thanks for the explanations, @dungba88. I suppose the scenario you're trying to solve for, is when users want to change the matchset of a KnnVectorQuery using full-precision or other reranking. I'm open to this change if it's a valid requirement. My perception is that it should be solvable by plumbing vector query results into a rescorer, then combining top-K hits with other (lexical) hits. For e.g. is this also a problem for hybrid search in OpenSearch/Elasticsearch ? I suspect they might have independent queries for both with some way to combine results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add the ability to inverse a Sort [lucene]
msokolov commented on PR #14775: URL: https://github.com/apache/lucene/pull/14775#issuecomment-2992625771 Should we refuse to allow inverse of Sort.SCORE? or RELEVANCE? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] build: add eclint to verify editorconfig in CI [lucene]
rmuir commented on code in PR #14825: URL: https://github.com/apache/lucene/pull/14825#discussion_r2159848746 ## .github/actions/eclint/action.yml: ## @@ -0,0 +1,33 @@ +name: Install eclint +description: Installs eclint from cache, or builds it + +inputs: + eclint-version: +required: false +description: "eclint version to use" +default: "v0.5.1" + +runs: + using: "composite" + steps: + - name: Cache eclint binary +id: cache-eclint +uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3 +with: + path: ~/.local/bin/eclint + key: eclint-${{ inputs.eclint-version }}-${{ runner.os }}-${{ runner.arch }} Review Comment: May seem overkill to only save single-digit seconds, but the purpose is to avoid the network requests (similar to gradle-wrapper). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] build: fix incorrect gradle dependency [lucene]
dweiss merged PR #14826: URL: https://github.com/apache/lucene/pull/14826 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] build: add eclint to verify editorconfig in CI [lucene]
rmuir opened a new pull request, #14825: URL: https://github.com/apache/lucene/pull/14825 You generally shouldn't need the check, as your editor should follow the editorconfig and not introduce the errors checked-for here. But it helps maintain the .editorconfig itself and ensure that files in the repository actually match whatever is configured there. If you want to enable the check locally for debugging: ``` GOBIN=${HOME}/.local/bin go install gitlab.com/greut/eclint/cmd/eclint@latest echo "lucene.tool.eclint=eclint" >> ~/.gradle/gradle.properties ``` The eclint action is not used, as it is not on the apache approved list. Getting the 3MB binary in CI takes 2s (cached) and 9s (uncached, using setup-go). The check itself runs fast: ``` real 0m3.647s user 0m6.702s sys 0m0.474s ``` @dweiss, I imagine you want to keep things a certain way, please feel free to push commits here. I just hack when it comes to the gradle side of things, but the actions part is decent I think. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] .editorconfig [lucene]
dsmiley commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2992922097 I definitely disagree on this PR being rushed. You aren't a fan of it but that doesn't make it rushed. > Trying to apply such settings to ALL files of ALL types in repo is too much, you need to deal with wacky file types (e.g. make with tabs), windows file types that should have CRLF, generated files, test files, files that aren't valid utf-8, and the list goes on and on. I mostly sympathize. As you & Dawid suggested, I can remove the initial everything glob, and instead copy the rules there now to the files types it lists. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] .editorconfig [lucene]
rmuir commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2992994166 I'm actually a huge fan of having editorconfig, I just know it can be tricky if you want to get it right (e.g. not have random style changes in PRs). Similar process to when we moved to spotless, fixing all sources to conform, marking the git commits special so they don't mess up `git blame`, dealing with the long tail issues described above. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Decrease TieredMergePolicy's default number of segments per tier to 8. [lucene]
jpountz commented on PR #14823: URL: https://github.com/apache/lucene/pull/14823#issuecomment-2993010598 Thanks @mikemccand ! I'll wait a few days before merging to give others a chance to take a look. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] build: add eclint to verify editorconfig in CI [lucene]
github-actions[bot] commented on PR #14825: URL: https://github.com/apache/lucene/pull/14825#issuecomment-2993289814 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] build: add eclint to verify editorconfig in CI [lucene]
rmuir commented on code in PR #14825: URL: https://github.com/apache/lucene/pull/14825#discussion_r2159850350 ## .github/actions/eclint/action.yml: ## @@ -0,0 +1,33 @@ +name: Install eclint +description: Installs eclint from cache, or builds it + +inputs: + eclint-version: +required: false +description: "eclint version to use" +default: "v0.5.1" Review Comment: we could alternatively do go.mod/go.sum to let dependabot take care, but its a bit more awkward and I'm afraid of tons of noise from the indirect deps. Plus those files use tab indentation :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] build: fix incorrect gradle dependency [lucene]
github-actions[bot] commented on PR #14826: URL: https://github.com/apache/lucene/pull/14826#issuecomment-2993299954 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] build: fix incorrect gradle dependency [lucene]
rmuir opened a new pull request, #14826: URL: https://github.com/apache/lucene/pull/14826 applyAstGrep doesn't run in CI, because it is set to depend on "tidy" instead of "check" -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add Query for reranking KnnFloatVectorQuery with full-precision vectors [lucene]
dungba88 commented on PR #14009: URL: https://github.com/apache/lucene/pull/14009#issuecomment-2993030559 > is when users want to change the matchset of a KnnVectorQuery using full-precision or other reranking Yes that's correct, @vigyasharma. We are using a hybrid search where KnnFloatVectorQuery and TermQuery (amongst others) are combined into a single BooleanQuery. Thus it is important to change the matchset of KnnFloatVectorQuery individually. For hybrid search in OpenSearch/Elastic Search, I'm wondering if @jmazanec15 and @benwtrent have any input. I'm having a feeling that it's quite common to combine lexical + KNN matching into a single BooleanQuery. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org