[ https://issues.apache.org/jira/browse/MRESOLVER-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127769#comment-17127769 ]
Michael Osipov commented on MRESOLVER-114: ------------------------------------------ I have put some effort into it analyzed the issue and it is disappointing. First of all, I think we suffer from [this|https://stackoverflow.com/q/21739080/696632] problem and both PRs are wrong while the retry approach is, IMHO, even worse even if it/both "solves" the issue. I was not able to find logically the code spot in question, but there *is* a race condition. Unfortunately, Resolver has very little logging which would help us to narrow down the issue. Now let me go into detail what I have figured out: When we go the callstack up from #resolve() we land in public ... resolveArtifacts(...) which does the following: {code:java} try ( SyncContext syncContext = syncContextFactory.newInstance( session, false ) ) { Collection<Artifact> artifacts = new ArrayList<>( requests.size() ); for ( ArtifactRequest request : requests ) { if ( request.getArtifact().getProperty( ArtifactProperties.LOCAL_PATH, null ) != null ) { continue; } artifacts.add( request.getArtifact() ); } syncContext.acquire( artifacts, null ); return resolve( session, requests ); } {code} Wait a minute, there is synchronization already, why does it still fail? Well...let's have a look at SyncContextFactory and SyncContext first: SyncContextFactory shall create a reentrant read write lock for the given session. The boolean flag says read or write. SyncContext#acquire() shall perform the lock operation on the values (GAV keys) (!) of the given artifacts/metadata. This approach is used everywhere in the code where interaction with the filesytem is necessary. So #resolve() is *not* intended to be thread safe and it would be wrong to try to change it. Back to the question: When we now look for all impls of SyncContext we find only DefaultSyncContext which does nothing, i.e., no synchronization at all. Why do other operations not fail with SyncContext? I assume that most either don't use a degree a concurrency where it would fail on the same artifacts or it runs so fast that we don't see race conditions. How can this now be fixed? We have two challenges here: 1. #newInstance() implies that syncContext is always new (other identity), but how shall #acquire() work multithreaded if no global locking on the artifacts can be maintained? Weird. The only solution which comes to my mind is to initialize a suitable lock implementation when #newInstance() is called in a synchronized block and put into RepositorySystemSession#getData().If this cannot be maintained we still will have the same race condition. 2. A suitable lock approach, if selfwritted, would require something like a concurrent hash map with weak refs onto muteces which represent the value (hashcode) of an artifact. On #acquire() the lock for that mutex is obtained and #lock() called. When the try block is left #release() is called automatically. Alternatively, one needs to use a KeyedReadWriteLock or StripedReadWriteLock because we cannot maintain a lock per artifacts. This could end with thousands of locks. I think even for someone who does concurrency on a daily basis, it would take some days to solve these points, for me whose concurrency skills are weak, weeks. I cannot provide this one. It has also to be noted that DefaultArtifactResolver is not marked with @Singleton. I don't know whether there could be multiple instances used in Maven which means that only a global lock will work because synchronized will not have any effect or if SyncContextFactory is a singleton it could hold the global lock and pass it to the SyncContext objects. See also: https://guava.dev/releases/19.0/api/docs/com/google/common/util/concurrent/Striped.html A completely alternative approach is to write lock files for those objects instead of an in-memory lock. SyncContext says "Note that the level of actual synchronization is subject to the implementation and might range from OS-wide to none." Unless someone can provide something very smart, I will do the following: * Add FIXMEs to all invocations of SyncContextFactory#newInstance() to a new issue which describes this problem. * Accept the compromise of PR #41. Opinions? > ArtifactNotFoundExceptions when building in parallel > ---------------------------------------------------- > > Key: MRESOLVER-114 > URL: https://issues.apache.org/jira/browse/MRESOLVER-114 > Project: Maven Resolver > Issue Type: Bug > Components: resolver > Affects Versions: 1.4.2 > Reporter: Rainer Reich > Priority: Major > Attachments: maven-debug-log.txt > > Time Spent: 20m > Remaining Estimate: 0h > > We have a multi-module project with quite many modules and many dependencies > and observe pretty random {{ArtifactNotFoundExceptions}} when building in > parallel with an empty local repository. > The "funny" thing is that Maven did in fact download the jar that it > complained about not finding. > In this example Maven said it could not find > {{edu.tum.cs:cup-runtime:jar:11a}} (see stacktrace below) > But it also said: {{Downloaded from central-mirror: > [https://mirror.xy.com/repository/maven-all-mirror/edu/tum/cs/cup-runtime/11a/cup-runtime-11a.jar]}}. > When looking into the local repository after the failed build we found the > cup-runtime.jar in the correct place with the correct checksum. > We tried the following to "fix" the problem: > * build with and without the takari extensions ({{takari-local-repository}} > & {{takari-smart-builder}}) using {{--builder smart}} > * also build with {{io.takari.aether:aether-connector-okhttp}} extension > * only execute goal package instead of install > * build with these properties: {{-Daether.connector.basic.threads=1 > -Daether.connector.resumeDownloads=false}} > But nothing helped. > Unfortunately it seems to be a race condition and we cannot reproduce it > consistently but it happens in about 1 out of 5 builds. > {code:java} > ... > [2019-11-18T16:46:29.370Z] [INFO] Downloaded from central-mirror: > https://mirror.xy.com/repository/maven-all-mirror/edu/tum/cs/cup-runtime/11a/cup-runtime-11a.jar > (13 kB at 738 kB/s) > ... > [2019-11-18T16:46:30.894Z] [ERROR] Failed to execute goal on project xy: > Could not resolve dependencies for project xy: The following artifacts could > not be resolved: edu.tum.cs:cup-runtime:jar:11a, > org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, > cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a > -> [Help 1] > [2019-11-18T16:46:30.894Z] > org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute > goal on project xy: Could not resolve dependencies for project xy: The > following artifacts could not be resolved: edu.tum.cs:cup-runtime:jar:11a, > org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, > cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies > (LifecycleDependencyResolver.java:269) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies > (LifecycleDependencyResolver.java:147) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved > (MojoExecutor.java:248) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:156) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:148) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject > (LifecycleModuleBuilder.java:117) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl.buildProject > (SmartBuilderImpl.java:205) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run > (SmartBuilderImpl.java:77) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515) > [2019-11-18T16:46:30.894Z] at java.util.concurrent.FutureTask.run > (FutureTask.java:264) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor.runWorker > (ThreadPoolExecutor.java:1128) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor$Worker.run > (ThreadPoolExecutor.java:628) > [2019-11-18T16:46:30.894Z] at java.lang.Thread.run (Thread.java:834) > [2019-11-18T16:46:30.894Z] Caused by: > org.apache.maven.project.DependencyResolutionException: Could not resolve > dependencies for project xy: The following artifacts could not be resolved: > edu.tum.cs:cup-runtime:jar:11a, org.checkerframework:checker-qual:jar:2.5.7, > org.ow2.asm:asm:jar:7.2, cglib:cglib:jar:3.3.0: Could not find artifact > edu.tum.cs:cup-runtime:jar:11a > [2019-11-18T16:46:30.894Z] at > org.apache.maven.project.DefaultProjectDependenciesResolver.resolve > (DefaultProjectDependenciesResolver.java:209) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies > (LifecycleDependencyResolver.java:243) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies > (LifecycleDependencyResolver.java:147) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved > (MojoExecutor.java:248) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:156) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:148) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject > (LifecycleModuleBuilder.java:117) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl.buildProject > (SmartBuilderImpl.java:205) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run > (SmartBuilderImpl.java:77) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515) > [2019-11-18T16:46:30.894Z] at java.util.concurrent.FutureTask.run > (FutureTask.java:264) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor.runWorker > (ThreadPoolExecutor.java:1128) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor$Worker.run > (ThreadPoolExecutor.java:628) > [2019-11-18T16:46:30.894Z] at java.lang.Thread.run (Thread.java:834) > [2019-11-18T16:46:30.894Z] Caused by: > org.eclipse.aether.resolution.DependencyResolutionException: The following > artifacts could not be resolved: edu.tum.cs:cup-runtime:jar:11a, > org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, > cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies > (DefaultRepositorySystem.java:357) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.project.DefaultProjectDependenciesResolver.resolve > (DefaultProjectDependenciesResolver.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies > (LifecycleDependencyResolver.java:243) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies > (LifecycleDependencyResolver.java:147) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved > (MojoExecutor.java:248) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:156) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:148) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject > (LifecycleModuleBuilder.java:117) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl.buildProject > (SmartBuilderImpl.java:205) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run > (SmartBuilderImpl.java:77) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515) > [2019-11-18T16:46:30.894Z] at java.util.concurrent.FutureTask.run > (FutureTask.java:264) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor.runWorker > (ThreadPoolExecutor.java:1128) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor$Worker.run > (ThreadPoolExecutor.java:628) > [2019-11-18T16:46:30.894Z] at java.lang.Thread.run (Thread.java:834) > [2019-11-18T16:46:30.894Z] Caused by: > org.eclipse.aether.resolution.ArtifactResolutionException: The following > artifacts could not be resolved: edu.tum.cs:cup-runtime:jar:11a, > org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, > cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve > (DefaultArtifactResolver.java:424) > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts > (DefaultArtifactResolver.java:229) > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies > (DefaultRepositorySystem.java:340) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.project.DefaultProjectDependenciesResolver.resolve > (DefaultProjectDependenciesResolver.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies > (LifecycleDependencyResolver.java:243) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies > (LifecycleDependencyResolver.java:147) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved > (MojoExecutor.java:248) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:156) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:148) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject > (LifecycleModuleBuilder.java:117) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl.buildProject > (SmartBuilderImpl.java:205) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run > (SmartBuilderImpl.java:77) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515) > [2019-11-18T16:46:30.894Z] at java.util.concurrent.FutureTask.run > (FutureTask.java:264) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor.runWorker > (ThreadPoolExecutor.java:1128) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor$Worker.run > (ThreadPoolExecutor.java:628) > [2019-11-18T16:46:30.894Z] at java.lang.Thread.run (Thread.java:834) > [2019-11-18T16:46:30.894Z] Caused by: > org.eclipse.aether.transfer.ArtifactNotFoundException: Could not find > artifact edu.tum.cs:cup-runtime:jar:11a > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve > (DefaultArtifactResolver.java:414) > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts > (DefaultArtifactResolver.java:229) > [2019-11-18T16:46:30.894Z] at > org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies > (DefaultRepositorySystem.java:340) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.project.DefaultProjectDependenciesResolver.resolve > (DefaultProjectDependenciesResolver.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies > (LifecycleDependencyResolver.java:243) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies > (LifecycleDependencyResolver.java:147) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved > (MojoExecutor.java:248) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:202) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:156) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute > (MojoExecutor.java:148) > [2019-11-18T16:46:30.894Z] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject > (LifecycleModuleBuilder.java:117) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl.buildProject > (SmartBuilderImpl.java:205) > [2019-11-18T16:46:30.894Z] at > io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run > (SmartBuilderImpl.java:77) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515) > [2019-11-18T16:46:30.894Z] at java.util.concurrent.FutureTask.run > (FutureTask.java:264) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor.runWorker > (ThreadPoolExecutor.java:1128) > [2019-11-18T16:46:30.894Z] at > java.util.concurrent.ThreadPoolExecutor$Worker.run > (ThreadPoolExecutor.java:628) > [2019-11-18T16:46:30.894Z] at java.lang.Thread.run (Thread.java:834) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)