[ https://issues.apache.org/jira/browse/MRESOLVER-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tibor Digana updated MRESOLVER-93: ---------------------------------- Fix Version/s: 1.4.2 > PathRecordingDependencyVisitor to handle 3 cycles > ------------------------------------------------- > > Key: MRESOLVER-93 > URL: https://issues.apache.org/jira/browse/MRESOLVER-93 > Project: Maven Resolver > Issue Type: Improvement > Components: resolver > Affects Versions: 1.3.3, 1.4.0, 1.4.1 > Reporter: Tomo Suzuki > Priority: Major > Fix For: 1.4.2 > > Attachments: IMG_0234.jpg, IMG_0235.jpg, IMG_0236.jpg, IMG_0237.jpg, > IMG_0238.jpg, IMG_0240.jpg, IMG_0241.jpg, IMG_0242.jpg, IMG_0243.jpg, > IMG_0244.jpg, IMG_0245.jpg, IMG_0255.jpg, IMG_0256.jpg, IMG_0257.jpg, > IMG_0258.jpg, IMG_0259.jpg, IMG_0260.jpg, IMG_0261.jpg, IMG_0262.jpg, > IMG_0263.jpg, IMG_0264.jpg, IMG_0265.jpg, IMG_0266.jpg > > Time Spent: 10m > Remaining Estimate: 0h > > PathRecordingDependencyVisitor cannot handle dependency graphs that have 3 or > more cycles such as below: > > {code:java} > gid:a:1 (1) > +- gid:b:0 > | \- ^1 > +- gid:b:1 > | \- ^1 > \- gid:b:2 > \- ^1 > {code} > It fails with StackOverflowError or OutOfMemoryError. [Test > case|https://github.com/suztomo/maven-resolver/commit/31b24dfe240997861e27661a7540546fbe6e0dab]. > > h1. Solutions > I came up with three solutions. I pick solution #1 for simplicity. > h2. 1. Use "parents" to check the cycle, rather than visited set > This is the simplest. Checking array element member is usually discouraged > especially for large data set. The implementation should confirm the overhead > of this solution. > h2. 2. Use AbstractMapBag/Multiset for visited set > Creating a new class that extends AbstractMapBag and leverages > IdentityHashMap. Although this solution would be theoretically more efficient > than solution #1, I felt it's overkill to create a class just for this > solution. > {code:java} > AbstractMapBag(new IdentityHashMap<DependencyNode, > AbstractMapBag.MutableInteger>()){code} > > [https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bag/AbstractMapBag.html] > > IdentityHashMap<DependencyNode, Integer>() would work as a multiset. > h2. 3. Call visitLeave only when visitEnter is true > The cause of this bug is > [DefaultDependencyNode|https://github.com/apache/maven-resolver/blob/47edcfe69c4e52ced4cb93d65b7348b5645cdd68/maven-resolver-api/src/main/java/org/eclipse/aether/graph/DefaultDependencyNode.java#L354] > calling visitLeave regardless of visitEnter result. > I'm not sure how many other visitors rely on visitLeave being called > regardless of visitEnter result. > h1. Illustration on why existing algorithm does not catch cycle > The following illustration is the node traversal for the test case above by > current algorithm. This illustration tracks the dependency node graph and the > "visited" set maintained by the visitor. > * visited set. An internal data structure in PathRecordingDependencyVisitor > to avoid cycle > ([link|https://github.com/apache/maven-resolver/blob/0c2373f6c66f20953b1a7e443ea1de8672d1b072/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/visitor/PathRecordingDependencyVisitor.java#L45]). > * visitEnter(node): PathRecordingDependencyVisitor's function > ([link|https://github.com/apache/maven-resolver/blob/0c2373f6c66f20953b1a7e443ea1de8672d1b072/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/visitor/PathRecordingDependencyVisitor.java#L100]). > When returning true, the node's children is traversed by the algorithm. This > function adds the node to visited set. > * visitLeave(node): PathRecordingDependencyVisitor's function > ([link|https://github.com/apache/maven-resolver/blob/0c2373f6c66f20953b1a7e443ea1de8672d1b072/maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/visitor/PathRecordingDependencyVisitor.java#L129]). > This function removes the node from visited set. > > The initial state starts with node "a" and visited set \{a}. > !IMG_0234.jpg|width=334,height=252! > First child of a is b0. Because visited does not contain, visitEnter(b0) > returns true, meaning that the algorithm traverses this b0's children next. > B0 is added to visited. > !IMG_0235.jpg|width=359,height=191! > B0's children is "a". Because visited set contains "a", visitEnter(a) returns > false. This means that the algorithm does not traverse this "a"'s children. A > is added to visited set (already it has). > !IMG_0236.jpg|width=438,height=197! > Now not traversing this "a"'s children, the algorithm calls visitLeave(a). > This removes "a" from visited set. > !IMG_0237.jpg|width=434,height=165! > B0's children are all traversed. the algorithm calls visitLeave(b0). This > removes "b0" from visited set. > !IMG_0238.jpg|width=459,height=197! > Now visited set is empty. > Next child of the root "a" is b1. B1 is not in visited set, thus > visitEnter(b1) returns true. This means the algorithm traverses the children > of this b1. > !IMG_0240.jpg|width=445,height=270! > B1's only child is a. "a" is not in visited set. visitEnter(a) returns true. > This means to traverse "a"'s children. > !IMG_0241.jpg|width=418,height=262! > A's first children is b0. b0 is not in visited set. visitEnter(b0) returns > true, meaning to traverse children of this b0. > !IMG_0242.jpg|width=422,height=208! > (img 0242) > The only child of b0 is "a". Visited set contains "a", and thus not > traversing its children. > !IMG_0243.jpg|width=491,height=191! > visitLeave(a) removes "a" from visited set. > !IMG_0244.jpg|width=481,height=189! > b0's children is all traversed. VisitLeave(b0) removes b0 from visited set. > !IMG_0245.jpg|width=498,height=182! > Next child of this "a" is b1. B1 is in visited set, and thus visitEnter(b1) > returns false. This node's children is not to be traversed. > !IMG_0255.jpg|width=545,height=245! > (img 0255) > visitLeave(b1) removes b1 from visited set. Now visited is emtpy. > !IMG_0256.jpg|width=528,height=294! > The last child of "a" is b2. VisitEnter(b2) returns true. It's children is to > be traversed. B2 is in visited set. > !IMG_0257.jpg|width=502,height=309! > B2's only child is "a". "a" is not in visited set, thus visitEnter(a) > returns true. The algorithm traverses this "a"'s children. > !IMG_0258.jpg|width=485,height=299! > (img 0258) > > (...omit...) > > IMG_0266 shows the step where I decided to give up. The algorithm does not > seem to stop. Indeed the test shows that. The path from the root to the > furthest a includes 5 "a" nodes. I concluded the visited set is not working > as expected to avoid cycle. > !IMG_0266.jpg|width=656,height=252! > > -- This message was sent by Atlassian Jira (v8.3.2#803003)