[ 
https://issues.apache.org/jira/browse/MRESOLVER-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127769#comment-17127769
 ] 

Michael Osipov commented on MRESOLVER-114:
------------------------------------------

I have put some effort into it analyzed the issue and it is disappointing. 
First of all, I think we suffer from 
[this|https://stackoverflow.com/q/21739080/696632] problem and both PRs are 
wrong while the retry approach is, IMHO, even worse even if it/both "solves" 
the issue. I was not able to find logically the code spot in question, but 
there *is* a race condition. Unfortunately, Resolver has very little logging 
which would help us to narrow down the issue.

Now let me go into detail what I have figured out:
When we go the callstack up from #resolve() we land in public ... 
resolveArtifacts(...) which does the following:
{code:java}
try ( SyncContext syncContext = syncContextFactory.newInstance( session, false 
) )
{
        Collection<Artifact> artifacts = new ArrayList<>( requests.size() );
        for ( ArtifactRequest request : requests )
        {
                if ( request.getArtifact().getProperty( 
ArtifactProperties.LOCAL_PATH, null ) != null )
                {
                        continue;
                }
                artifacts.add( request.getArtifact() );
        }

        syncContext.acquire( artifacts, null );

        return resolve( session, requests );
}
{code}

Wait a minute, there is synchronization already, why does it still fail? 
Well...let's have a look at SyncContextFactory and SyncContext first:
SyncContextFactory shall create a reentrant read write lock for the given 
session. The boolean flag says read or write. SyncContext#acquire() shall 
perform the lock operation on the values (GAV keys) (!) of the given 
artifacts/metadata. This approach is used everywhere in the code where 
interaction with the filesytem is necessary. So #resolve() is *not* intended to 
be thread safe and it would be wrong to try to change it. Back to the question: 
When we now look for all impls of SyncContext we find only DefaultSyncContext 
which does nothing, i.e., no synchronization at all. Why do other operations 
not fail with SyncContext? I assume that most either don't use a degree a 
concurrency where it would fail on the same artifacts or it runs so fast that 
we don't see race conditions.
How can this now be fixed? We have two challenges here:
1. #newInstance() implies that syncContext is always new (other identity), but 
how shall #acquire() work multithreaded if no global locking on the artifacts 
can be maintained? Weird. The only solution which comes to my mind is to 
initialize a suitable lock implementation when #newInstance() is called in a 
synchronized block and put into RepositorySystemSession#getData().If this 
cannot be maintained we still will have the same race condition.
2. A suitable lock approach, if selfwritted, would require something like a 
concurrent hash map with weak refs onto muteces which represent the value 
(hashcode) of an artifact. On #acquire() the lock for that mutex is obtained 
and #lock() called. When the try block is left #release() is called 
automatically.
Alternatively, one needs to use a KeyedReadWriteLock or StripedReadWriteLock 
because we cannot maintain a lock per artifacts. This could end with thousands 
of locks.

I think even for someone who does concurrency on a daily basis, it would take 
some days to solve these points, for me whose concurrency skills are weak, 
weeks. I cannot provide this one.
It has also to be noted that DefaultArtifactResolver is not marked with 
@Singleton. I don't know whether there could be multiple instances used in 
Maven which means that only a global lock will work because synchronized will 
not have any effect or if SyncContextFactory is a singleton it could hold the 
global lock and pass it to the SyncContext objects.

See also: 
https://guava.dev/releases/19.0/api/docs/com/google/common/util/concurrent/Striped.html

A completely alternative approach is to write lock files for those objects 
instead of an in-memory lock. SyncContext says "Note that the level of actual 
synchronization is subject to the implementation and might range from OS-wide 
to none."

Unless someone can provide something very smart, I will do the following:

* Add FIXMEs to all invocations of SyncContextFactory#newInstance() to a new 
issue which describes this problem.
* Accept the compromise of PR #41.

Opinions?

> ArtifactNotFoundExceptions when building in parallel
> ----------------------------------------------------
>
>                 Key: MRESOLVER-114
>                 URL: https://issues.apache.org/jira/browse/MRESOLVER-114
>             Project: Maven Resolver
>          Issue Type: Bug
>          Components: resolver
>    Affects Versions: 1.4.2
>            Reporter: Rainer Reich
>            Priority: Major
>         Attachments: maven-debug-log.txt
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have a multi-module project with quite many modules and many dependencies 
> and observe pretty random {{ArtifactNotFoundExceptions}} when building in 
> parallel with an empty local repository.
> The "funny" thing is that Maven did in fact download the jar that it 
> complained about not finding.
> In this example Maven said it could not find 
> {{edu.tum.cs:cup-runtime:jar:11a}} (see stacktrace below)
>  But it also said: {{Downloaded from central-mirror: 
> [https://mirror.xy.com/repository/maven-all-mirror/edu/tum/cs/cup-runtime/11a/cup-runtime-11a.jar]}}.
> When looking into the local repository after the failed build we found the 
> cup-runtime.jar in the correct place with the correct checksum.
> We tried the following to "fix" the problem:
>  * build with and without the takari extensions ({{takari-local-repository}} 
> & {{takari-smart-builder}}) using {{--builder smart}}
>  * also build with {{io.takari.aether:aether-connector-okhttp}} extension
>  * only execute goal package instead of install
>  * build with these properties: {{-Daether.connector.basic.threads=1 
> -Daether.connector.resumeDownloads=false}}
> But nothing helped.
> Unfortunately it seems to be a race condition and we cannot reproduce it 
> consistently but it happens in about 1 out of 5 builds.
> {code:java}
> ...
> [2019-11-18T16:46:29.370Z] [INFO] Downloaded from central-mirror: 
> https://mirror.xy.com/repository/maven-all-mirror/edu/tum/cs/cup-runtime/11a/cup-runtime-11a.jar
>  (13 kB at 738 kB/s)
> ...
> [2019-11-18T16:46:30.894Z] [ERROR] Failed to execute goal on project xy: 
> Could not resolve dependencies for project xy: The following artifacts could 
> not be resolved: edu.tum.cs:cup-runtime:jar:11a, 
> org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, 
> cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a 
> -> [Help 1]
> [2019-11-18T16:46:30.894Z] 
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal on project xy: Could not resolve dependencies for project xy: The 
> following artifacts could not be resolved: edu.tum.cs:cup-runtime:jar:11a, 
> org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, 
> cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies
>  (LifecycleDependencyResolver.java:269)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies
>  (LifecycleDependencyResolver.java:147)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved
>  (MojoExecutor.java:248)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl.buildProject 
> (SmartBuilderImpl.java:205)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run 
> (SmartBuilderImpl.java:77)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515)
> [2019-11-18T16:46:30.894Z]     at java.util.concurrent.FutureTask.run 
> (FutureTask.java:264)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker 
> (ThreadPoolExecutor.java:1128)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run 
> (ThreadPoolExecutor.java:628)
> [2019-11-18T16:46:30.894Z]     at java.lang.Thread.run (Thread.java:834)
> [2019-11-18T16:46:30.894Z] Caused by: 
> org.apache.maven.project.DependencyResolutionException: Could not resolve 
> dependencies for project xy: The following artifacts could not be resolved: 
> edu.tum.cs:cup-runtime:jar:11a, org.checkerframework:checker-qual:jar:2.5.7, 
> org.ow2.asm:asm:jar:7.2, cglib:cglib:jar:3.3.0: Could not find artifact 
> edu.tum.cs:cup-runtime:jar:11a
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.project.DefaultProjectDependenciesResolver.resolve 
> (DefaultProjectDependenciesResolver.java:209)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies
>  (LifecycleDependencyResolver.java:243)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies
>  (LifecycleDependencyResolver.java:147)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved
>  (MojoExecutor.java:248)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl.buildProject 
> (SmartBuilderImpl.java:205)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run 
> (SmartBuilderImpl.java:77)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515)
> [2019-11-18T16:46:30.894Z]     at java.util.concurrent.FutureTask.run 
> (FutureTask.java:264)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker 
> (ThreadPoolExecutor.java:1128)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run 
> (ThreadPoolExecutor.java:628)
> [2019-11-18T16:46:30.894Z]     at java.lang.Thread.run (Thread.java:834)
> [2019-11-18T16:46:30.894Z] Caused by: 
> org.eclipse.aether.resolution.DependencyResolutionException: The following 
> artifacts could not be resolved: edu.tum.cs:cup-runtime:jar:11a, 
> org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, 
> cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
> (DefaultRepositorySystem.java:357)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.project.DefaultProjectDependenciesResolver.resolve 
> (DefaultProjectDependenciesResolver.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies
>  (LifecycleDependencyResolver.java:243)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies
>  (LifecycleDependencyResolver.java:147)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved
>  (MojoExecutor.java:248)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl.buildProject 
> (SmartBuilderImpl.java:205)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run 
> (SmartBuilderImpl.java:77)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515)
> [2019-11-18T16:46:30.894Z]     at java.util.concurrent.FutureTask.run 
> (FutureTask.java:264)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker 
> (ThreadPoolExecutor.java:1128)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run 
> (ThreadPoolExecutor.java:628)
> [2019-11-18T16:46:30.894Z]     at java.lang.Thread.run (Thread.java:834)
> [2019-11-18T16:46:30.894Z] Caused by: 
> org.eclipse.aether.resolution.ArtifactResolutionException: The following 
> artifacts could not be resolved: edu.tum.cs:cup-runtime:jar:11a, 
> org.checkerframework:checker-qual:jar:2.5.7, org.ow2.asm:asm:jar:7.2, 
> cglib:cglib:jar:3.3.0: Could not find artifact edu.tum.cs:cup-runtime:jar:11a
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve 
> (DefaultArtifactResolver.java:424)
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts 
> (DefaultArtifactResolver.java:229)
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
> (DefaultRepositorySystem.java:340)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.project.DefaultProjectDependenciesResolver.resolve 
> (DefaultProjectDependenciesResolver.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies
>  (LifecycleDependencyResolver.java:243)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies
>  (LifecycleDependencyResolver.java:147)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved
>  (MojoExecutor.java:248)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl.buildProject 
> (SmartBuilderImpl.java:205)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run 
> (SmartBuilderImpl.java:77)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515)
> [2019-11-18T16:46:30.894Z]     at java.util.concurrent.FutureTask.run 
> (FutureTask.java:264)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker 
> (ThreadPoolExecutor.java:1128)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run 
> (ThreadPoolExecutor.java:628)
> [2019-11-18T16:46:30.894Z]     at java.lang.Thread.run (Thread.java:834)
> [2019-11-18T16:46:30.894Z] Caused by: 
> org.eclipse.aether.transfer.ArtifactNotFoundException: Could not find 
> artifact edu.tum.cs:cup-runtime:jar:11a
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve 
> (DefaultArtifactResolver.java:414)
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts 
> (DefaultArtifactResolver.java:229)
> [2019-11-18T16:46:30.894Z]     at 
> org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
> (DefaultRepositorySystem.java:340)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.project.DefaultProjectDependenciesResolver.resolve 
> (DefaultProjectDependenciesResolver.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies
>  (LifecycleDependencyResolver.java:243)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies
>  (LifecycleDependencyResolver.java:147)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved
>  (MojoExecutor.java:248)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:202)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> [2019-11-18T16:46:30.894Z]     at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl.buildProject 
> (SmartBuilderImpl.java:205)
> [2019-11-18T16:46:30.894Z]     at 
> io.takari.maven.builder.smart.SmartBuilderImpl$ProjectBuildTask.run 
> (SmartBuilderImpl.java:77)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515)
> [2019-11-18T16:46:30.894Z]     at java.util.concurrent.FutureTask.run 
> (FutureTask.java:264)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker 
> (ThreadPoolExecutor.java:1128)
> [2019-11-18T16:46:30.894Z]     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run 
> (ThreadPoolExecutor.java:628)
> [2019-11-18T16:46:30.894Z]     at java.lang.Thread.run (Thread.java:834)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to