On 6/11/25 08:54, Michael Bien wrote:
> Hi,
>
> 3/4 stack traces you posted contained the constructor frame with the
> Objects.hash() call.
>
> Hash code over so many collections can scale badly once the collections grow:
> https://github.com/apache/maven-resolver/blob/145588be6d5b2c0111b5f367787d4c9347c326da/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/manager/AbstractDependencyManager.java#L118-L126
>
> It will grab every key/value pair and call hashCode() on it to build the
> final value.
>
> If the dependency manager isn't put into Sets/Maps the hashcode could be
> potentially calculated lazily (which would ideally not compute it at all
> and avoid the problem).
> But I haven't looked at the code any deeper than that - maybe this isn't
> possible.

That's essentially what the 1.9.x impl did:
https://github.com/apache/maven-resolver/blob/8535d09660cb68eb55d8e75228d5d35eaae00c88/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/manager/TransitiveDependencyManager.java#L232-L237
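
For illustration, a minimal sketch of what lazy hash code computation could look like. The class and field names are made up and this is not the resolver code; it also assumes the maps are never mutated after construction, otherwise caching the value would be wrong.

```java
import java.util.Map;
import java.util.Objects;

/**
 * Hypothetical stand-in for a dependency manager that holds several maps.
 * Illustrative only, not the actual maven-resolver class.
 */
final class LazyHashManager {
    // Assumption: callers never mutate these maps after construction.
    private final Map<String, String> managedVersions;
    private final Map<String, String> managedScopes;

    // 0 means "not computed yet" (same idiom as java.lang.String).
    private int hash;

    LazyHashManager(Map<String, String> managedVersions, Map<String, String> managedScopes) {
        this.managedVersions = managedVersions;
        this.managedScopes = managedScopes;
        // No Objects.hash(...) here, so constructing an instance stays cheap.
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof LazyHashManager)) {
            return false;
        }
        LazyHashManager other = (LazyHashManager) obj;
        return managedVersions.equals(other.managedVersions)
                && managedScopes.equals(other.managedScopes);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Walks every key/value pair, so do it on first use only.
            // If the real value happens to be 0 it is simply recomputed.
            h = Objects.hash(managedVersions, managedScopes);
            hash = h;
        }
        return h;
    }
}
```

If an instance never ends up in a Set or as a Map key, hashCode() is never called and the maps are never walked.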

Updating AbstractDependencyManager alone won't change much, though, since the
whole graph computes its hash codes eagerly now, so hashCode() would be called
anyway, just a little bit later.

If there is a public project which hits this bottleneck I could take a look
and play with some ideas to try to make this scale better. (I don't really
want to spam this thread too much.)

best regards,
michael

>
> (as sidenote: managedLocalPaths is missing in equals)
>
>
> I do like using async-profiler* in situations like this since it is easy to
> use and the flame graphs often provide a good overview showing relative CPU
> usage. But in this case you likely already found the bottleneck in the
> stacktraces you took.
>
> usage would be:
>
> sudo sysctl kernel.kptr_restrict=0 && sudo sysctl kernel.perf_event_paranoid=1
>
> export MAVEN_OPTS=-agentpath:/path/to/async-profiler-4.0-linux-x64/lib/libasyncProfiler.so=start,event=cpu,file=maven.html
>
> run mvn command. Check maven.html.
>
>
> best regards,
> michael
>
>
> * https://github.com/async-profiler/async-profiler
>
>
> On 6/10/25 18:10, Sergey Chernov wrote:
>> I was able to run the Maven build (~900 modules) with the profiler.
>> Used `Apache Maven 4.0.0-rc-3` and the simplest possible command `mvn
>> initialize` which demonstrates reproducible degradation, the build is using
>> 6 worker threads.
>> Maven 3 runs this command in 5 sec, Maven 4 - 6 minutes (no mistake).
>> YourKit highlighted as the most hot
>> `org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies(RepositorySystemSession, CollectRequest)`
>> method.
>> I collected several stacktraces during the build and found this common part
>> of the stacktrace
>> ```
>> at org.eclipse.aether.util.graph.manager.AbstractDependencyManager.deriveChildManager
>> at org.eclipse.aether.internal.impl.collect.bf.BfDependencyCollector.doRecurse(BfDependencyCollector.java:361)
>> at org.eclipse.aether.internal.impl.collect.bf.BfDependencyCollector.processDependency(BfDependencyCollector.java:320)
>> at org.eclipse.aether.internal.impl.collect.bf.BfDependencyCollector.doCollectDependencies(BfDependencyCollector.java:202)
>> at org.eclipse.aether.internal.impl.collect.DependencyCollectorDelegate.collectDependencies(DependencyCollectorDelegate.java:222)
>> at org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.collectDependencies(DefaultDependencyCollector.java:79)
>> at org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies(DefaultRepositorySystem.java:241)
>> at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve(DefaultProjectDependenciesResolver.java:157)
>> at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies(LifecycleDependencyResolver.java:260)
>> at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectArtifacts(LifecycleDependencyResolver.java:208)
>> - locked <xxxxxxxxxxxx> (a org.apache.maven.project.artifact.DefaultProjectArtifactsCache$CacheKey)
>> at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies(LifecycleDependencyResolver.java:128)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved(MojoExecutor.java:368)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute(MojoExecutor.java:307)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:214)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:179)
>> at org.apache.maven.lifecycle.internal.MojoExecutor$1.run(MojoExecutor.java:168)
>> at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute(DefaultMojosExecutionStrategy.java:39)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:165)
>> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:110)
>> at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder.lambda$createBuildCallable$1(MultiThreadedBuilder.java:191)
>> at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$$Lambda$1189/0x0000000800526240.call(Unknown Source)
>> at java.util.concurrent.FutureTask.run(java.base@17.0.14/FutureTask.java:264)
>> at java.util.concurrent.Executors$RunnableAdapter.call(java.base@17.0.14/Executors.java:539)
>> at java.util.concurrent.FutureTask.run(java.base@17.0.14/FutureTask.java:264)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.14/ThreadPoolExecutor.java:1136)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.14/ThreadPoolExecutor.java:635)
>> at java.lang.Thread.run(java.base@17.0.14/Thread.java:840)
>> ```
>> As you see, the top method in the snippet is `deriveChildManager` and in
>> the captured stack traces there are different combinations of stacktraces
>> above it. But the below part is always the same.
>> In this gist you can find 4 real stacktraces of the same Maven process
>> captured during the build when it was already slow
>> https://gist.github.com/seregamorph/bace2d5087ba195279583704bb41a5c4
>>
>> Things I'd like to focus attention is: it looks like it's degrading
>> smoothly, maybe the degradation depends on the depth of dependency tree,
>> maybe on the number of transitive dependencies (correlating values). But in
>> the very beginning (first 400 modules) it goes quite fast.
>> It does not fail with OOM, so probably the problem is around some nested
>> iterations.
>>
>> Let me know what else I can do (including patching maven to have more
>> sensors) as I'm not sure how to provide a reproducible project model (it is
>> private).
>> I tried the gradle performance testing tool
>> https://gist.github.com/melix/c09f6d8d27b0005302cc3317c9c9be05 , but the
>> project that it generates does not reproduce the problem (probably because
>> the dependency tree of the huge multi-module project is almost flat).