Hello

I've finally found some time to check your PR #176, Tamás. Here are my
comments and answers (also to previous messages).

> https://github.com/apache/maven-resolver/pull/176
>
> So here is some implementation "demo" (that could be made into extension
> point), as explained in Draft PR description.
> BUT, also as written in PR, am getting a feeling that doing this is
> "dangerous", and a simple callback with whole collected graph would be
> better....
>

I've built Maven 3.8.5.1 (special local version) with maven-resolver
1.8.0.1 (special local version) with PR#176 included. I was easily able to
switch between DF and BF collectors (-Daether.collector.impl) and I could
find the "path" to "top" (or "current") dependency:
 - BF:
org.eclipse.aether.internal.impl.collect.bf.DependencyProcessingContext#parents
 - DF: org.eclipse.aether.internal.impl.collect.df.NodeStack#nodes

> - 1st: Personally, from a Resolver perspective, I'd just provide an API
> (basically the author extending resolver should implement) and make it
> simple to "click in" (Sisu component discovery).
> - 2nd: resolver IMHO should not provide any out of the box component
> implementation at all
>

I agree - there should be no additional processing without an explicit
extension (custom Sisu/Plexus component)

> So 1st would provide a "stable" extension point for users who would like to
> "integrate" with resolver at this point (like you did), but it could become
> possible using simply this new API, instead the hoops and loops your code
> was forced to do (as resolver is quite "closed" in this respect).
>

Indeed - I had to shade several resolver classes simply to make them public
(with protected methods). With DF/BF resolvers, it'd be even more important
to have some clear contract.
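
To make "clear contract" a bit more concrete, here is a minimal sketch in
plain Java. Everything in it is hypothetical: CollectionListener and
Collector are stand-ins I made up for illustration, not existing resolver
types, and the plain list of listeners only imitates what Sisu component
discovery would provide. The point is that with no listener plugged in,
the collector does no extra work.

```java
import java.util.ArrayList;
import java.util.List;

final class ExtensionPointSketch {

    // Hypothetical contract an extension author would implement; not an existing resolver interface.
    interface CollectionListener {
        // Called once per collected dependency with the path from the root to that dependency.
        void dependencyCollected(List<String> pathRootToCurrent);
    }

    // Hypothetical collector stand-in; the listener list imitates discovered components.
    static final class Collector {
        private final List<CollectionListener> listeners;

        Collector(List<CollectionListener> listeners) {
            this.listeners = List.copyOf(listeners);
        }

        void collected(List<String> pathRootToCurrent) {
            if (listeners.isEmpty()) {
                return; // nothing plugged in: keep the hot path free of extra work
            }
            // Hand out an immutable snapshot so listeners cannot disturb the collector's state.
            List<String> snapshot = List.copyOf(pathRootToCurrent);
            for (CollectionListener listener : listeners) {
                listener.dependencyCollected(snapshot);
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> seen = new ArrayList<>();
        CollectionListener recorder = seen::add;
        Collector collector = new Collector(List.of(recorder));
        collector.collected(List.of("root:app:1.0", "commons-logging:commons-logging:1.1"));
        System.out.println(seen);
    }
}
```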

> As for 2nd point, while I do like your idea of "decorating" local
> repository, I'd try a bit different route: I'd integrate this
> https://github.com/lambdazen/bitsy that makes possible to use Apache
> Tinkerpop's Gremlin queries to ask about the built graph for example...
>

At first glance, it looks like overkill ;) But I probably haven't looked
at it closely enough...

So, after playing a bit with the BF/DF collectors in 1.8.0[.1] and your
PR #176, I see that the example extension point you've introduced,
org.eclipse.aether.internal.impl.collect.DependencyCollectorDelegate#dependencyCollected(),
fires a bit too early for my use case. It's invoked during dependency
collection, but I think it'd be better to simply use the "full path" at
the moment of the actual download (or resolution from the local
repository).

My whole need to extend resolver was to collect the path from initial to
final dependency, so the stack is available when it's needed.
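
To illustrate the kind of path I mean, here is a minimal sketch in plain
Java. Node and ReversePathSketch are hypothetical stand-ins (Node is NOT
org.eclipse.aether.graph.DependencyNode), and the coordinates are just
shortened examples. It walks a tree depth-first while keeping a stack, and
when the target is reached the stack is exactly the path from the final
dependency back to the initial one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ReversePathSketch {

    // Hypothetical stand-in for a graph node; not a resolver class.
    static final class Node {
        final String gav;
        final List<Node> children = new ArrayList<>();

        Node(String gav) {
            this.gav = gav;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Returns the path from the target up to the root, or an empty list if the target is absent.
    static List<String> reversePath(Node root, String targetGav) {
        Deque<Node> stack = new ArrayDeque<>();
        List<String> path = new ArrayList<>();
        if (walk(root, targetGav, stack)) {
            for (Node n : stack) { // ArrayDeque iterates from the most recently pushed element down
                path.add(n.gav);
            }
        }
        return path;
    }

    private static boolean walk(Node node, String targetGav, Deque<Node> stack) {
        stack.push(node); // the stack always holds root..current, with current on top
        if (node.gav.equals(targetGav)) {
            return true; // leave the stack intact so the caller can read the path
        }
        for (Node child : node.children) {
            if (walk(child, targetGav, stack)) {
                return true;
            }
        }
        stack.pop(); // target not below this node: backtrack
        return false;
    }

    public static void main(String[] args) {
        Node root = new Node("org.apache.maven.plugins:maven-site-plugin:3.11.0")
            .add(new Node("org.apache.maven.doxia:doxia-site-renderer:1.11.1")
                .add(new Node("commons-logging:commons-logging:1.1")
                    .add(new Node("log4j:log4j:1.2.12"))));
        String prefix = "";
        for (String gav : reversePath(root, "log4j:log4j:1.2.12")) {
            System.out.println(prefix + gav);
            prefix = prefix.isEmpty() ? " -> " : "  " + prefix;
        }
    }
}
```

This only finds the first path to the target; the same artifact can of
course be reachable through several paths.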

Initially I thought that org.eclipse.aether.RequestTrace should be the
thing I could use to get current dependency path, but I found it's not
possible.

Maybe your DependencyCollectorDelegate#dependencyCollected() could simply
"expose" the List<DependencyNode> path somewhere? Maybe as an attribute in
the Maven session?
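
A rough sketch of that idea in plain Java, under loud assumptions:
SessionAttributes is a hypothetical stand-in backed by a concurrent map
(not a Maven or resolver type), the key prefix is made up, and paths are
plain strings instead of DependencyNode lists. The collector side
publishes an immutable snapshot per artifact, and whoever handles the
actual download looks it up later.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PathAttributeSketch {

    // Hypothetical stand-in for session-scoped storage; just a concurrent map.
    static final class SessionAttributes {
        private final Map<Object, Object> data = new ConcurrentHashMap<>();

        void set(Object key, Object value) {
            data.put(key, value);
        }

        Object get(Object key) {
            return data.get(key);
        }
    }

    // Made-up key prefix; one entry per artifact so different artifacts do not overwrite each other.
    private static final String KEY_PREFIX = "dependency.path/";

    // Collector side: publish an immutable snapshot of the current root-to-dependency path.
    static void publish(SessionAttributes session, List<String> pathRootToCurrent) {
        String current = pathRootToCurrent.get(pathRootToCurrent.size() - 1);
        // If the same artifact is reached via several paths, the last published one wins.
        session.set(KEY_PREFIX + current, List.copyOf(pathRootToCurrent));
    }

    // Download side: read the path for an artifact, or an empty list if none was published.
    @SuppressWarnings("unchecked")
    static List<String> lookup(SessionAttributes session, String gav) {
        Object value = session.get(KEY_PREFIX + gav);
        return value == null ? List.of() : (List<String>) value;
    }

    public static void main(String[] args) {
        SessionAttributes session = new SessionAttributes();
        publish(session, List.of(
            "org.apache.maven.plugins:maven-site-plugin:3.11.0",
            "commons-logging:commons-logging:1.1",
            "log4j:log4j:1.2.12"));
        System.out.println(lookup(session, "log4j:log4j:1.2.12"));
    }
}
```

Storing a copy rather than the live stack matters once collection runs on
more than one thread, since the reader may look at the path long after the
collector has moved on.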

kind regards
Grzegorz Grzybek

On Wed, 11 May 2022 at 18:40, Tamás Cservenák <[email protected]> wrote:

> Howdy,
>
> https://github.com/apache/maven-resolver/pull/176
>
> So here is some implementation "demo" (that could be made into extension
> point), as explained in Draft PR description.
> BUT, also as written in PR, am getting a feeling that doing this is
> "dangerous", and a simple callback with whole collected graph would be
> better....
>
>
> WDYT?
>
> Tamas
>
> On Mon, May 2, 2022 at 4:18 PM Tamás Cservenák <[email protected]>
> wrote:
>
> > Howdy,
> >
> > just a few short answers:
> > - 1st: Personally, from a Resolver perspective, I'd just provide an API
> > (basically the author extending resolver should implement) and make it
> > simple to "click in" (Sisu component discovery).
> > - 2nd: resolver IMHO should not provide any out of the box component
> > implementation at all
> >
> > So 1st would provide a "stable" extension point for users who would like
> > to "integrate" with resolver at this point (like you did), but it could
> > become possible using simply this new API, instead the hoops and loops
> your
> > code was forced to do (as resolver is quite "closed" in this respect).
> >
> > As for 2nd point, while I do like your idea of "decorating" local
> > repository, I'd try a bit different route: I'd integrate this
> > https://github.com/lambdazen/bitsy that makes possible to use Apache
> > Tinkerpop's Gremlin queries to ask about the built graph for example...
> >
> > And one big remark: the collector is the "hottest point" in resolver
> (heap
> > and cpu wise), so ANY "new API" implementation should be aware, that each
> > "lost" millisecond directly affects resolver collection speed, but I
> think
> > for "research kind" of stuff, or just "recording the process result"
> should
> > fit in just fine. I don't see this as a "standard" feature of Maven, but
> > who knows? :)
> >
> > Just my 5 cents...
> >
> > HTH
> > Tamas
> >
> > On Mon, May 2, 2022 at 4:09 PM Grzegorz Grzybek <[email protected]>
> > wrote:
> >
> >> Thank you Tamás for checking my experiment
> >>
> >> I'm just finishing my work before tomorrow's national holiday, but will
> >> read your information more carefully soon.
> >>
> >> Whether it's DFS or BFS, as long as there's tracking from initial to
> >> ultimate dependency, it's enough. DFS sounds more "natural" here
> though. I
> >> didn't check the CollectResult class yet - is it created per dependency
> or
> >> for the entire project?
> >>
> >> And yes - I didn't check multithreading, as in normal scenario (just
> `mvn
> >> clean install`) I didn't observe concurrency issues accessing the stack.
> >> Mind that I know a bit about maven "components", but there are
> definitely
> >> few missing things in my understanding.
> >>
> >> Checking your output, I see there are two aspects of this potential
> >> enhancement to the resolver:
> >>  - 1st - how to effectively collect the "reverse dependency tree" in
> >> context of DFS/BFS/multithreading
> >>  - 2nd - how to write the information
> >>
> >> 2nd aspect could include:
> >>  - whether there should be ".tracking" for each GAV directory in local
> >> repo
> >> (tracking for the purpose of entire local repository)
> >>  - maybe there should be configurable output location for single report
> of
> >> a build? (tracking for the purpose of single project)
> >>  - which format to use (human consumable or machine readable?)
> >>
> >> For now I've used resolver 1.6.3 from Maven 3.8.5, but I'll look at
> `main`
> >> branch too.
> >>
> >> kind regards
> >> Grzegorz Grzybek
> >>
> >>
> >> On Mon, 2 May 2022 at 15:57, Tamás Cservenák <[email protected]>
> >> wrote:
> >>
> >> > What I missed to mention: in my case the trees in the gist are about
> >> > "resolving maven-core 3.5.8", but I guess you figured it out from the
> >> > tree....
> >> >
> >> > T
> >> >
> >> > On Mon, May 2, 2022 at 3:55 PM Tamás Cservenák <[email protected]>
> >> > wrote:
> >> >
> >> > > Howdy,
> >> > >
> >> > > I did some experiment, that (partially re-using your code to dump
> the
> >> rev
> >> > > tree) produces this output:
> >> > > https://gist.github.com/cstamas/598a3266f943984442c00df30520294f
> >> > >
> >> > > (note: 1.8.0 resolver has two collector implementations: original
> >> > > Depth-First and new Breadth-First called DF and BF respectively)
> >> > >
> >> > > The code is not pushed yet anywhere, but I plan to make an API for
> >> this,
> >> > > and as you can see, it works
> >> > > for both implementations of collectors. Also, I hook ONLY into
> >> collector,
> >> > > as that's the place where the graph
> >> > > is being built, but this is logically equivalent to your "More
> >> > interesting
> >> > > ... 2nd case".
> >> > >
> >> > > Will ping once again when I have the changes....
> >> > >
> >> > > Thanks
> >> > > Tamas
> >> > >
> >> > > On Thu, Apr 28, 2022 at 9:01 PM Tamás Cservenák <
> [email protected]>
> >> > > wrote:
> >> > >
> >> > >> Howdy,
> >> > >>
> >> > >> This is very cool, I was actually tinkering on very similar issues
> in
> >> > >> resolver coming from totally different angles.
> >> > >>
> >> > >> And yes, the resolver collector is not quite "extension" friendly,
> >> but
> >> > we
> >> > >> will make it right.
> >> > >> Just FYI, that in the latest resolver (1.8.0) there are actually
> two
> >> > >> implementations: depth-first (original) and breadth-first (new).
> >> > >>
> >> > >> By looking at your code: collection is most critical regarding
> >> > >> performance and memory in the resolver, so "hooking" into it (like
> >> > sending
> >> > >> events per each step) might not be the best, but still, what kind
> of
> >> > >> extension points would you envision in the collector?
> >> > >>
> >> > >> For example, to achieve what you want, it would be completely
> enough
> >> to
> >> > >> receive the final CollectResult (the full graph), no?
> >> > >> As -- from a resolver perspective -- that would be simplest,
> >> especially
> >> > >> that now we have two collector implementations...
> >> > >>
> >> > >> Also, in case of multi threading, your shared stack would not cut,
> >> would
> >> > >> it?
> >> > >>
> >> > >> I personally was also looking into these, especially after some of
> >> the
> >> > >> latest additions to resolver in 1.8.0 and current master....
> >> > >>
> >> > >>
> >> > >> Thanks
> >> > >> T
> >> > >>
> >> > >>
> >> > >> On Thu, Apr 28, 2022 at 12:45 PM Grzegorz Grzybek <
> >> [email protected]
> >> > >
> >> > >> wrote:
> >> > >>
> >> > >>> Hello
> >> > >>>
> >> > >>> TL;DR: https://github.com/grgrzybek/tracking-maven-extension
> >> > >>>
> >> > >>> I'd like to share some proof of concept I made. It all started
> with
> >> a
> >> > >>> question "why I'm getting log4j:log4j:1.2.12" in my local Maven
> >> > >>> repository
> >> > >>> when building trivial project with fresh local repo?
> >> > >>>
> >> > >>> I knew it's possible to `grep -r --include=*.pom 1.2.12` the poms
> >> that
> >> > >>> declare old log4j, but I needed something better.
> >> > >>>
> >> > >>> In short - I managed to persist the information available in
> >> > >>>
> >> > >>>
> >> >
> >>
> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
> >> > >>> stack.
> >> > >>> I wrote a Maven extension that can be put into $MAVEN_HOME/lib/ext
> >> or
> >> > >>> used
> >> > >>> with "-Dmaven.ext.class.path" which does two things:
> >> > >>>
> >> > >>>    1. adds org.eclipse.aether.RepositoryListener component that
> >> writes
> >> > >>> some
> >> > >>>    information when a dependency is FIRST downloaded from remote
> >> > >>> repository
> >> > >>>    2. adds org.eclipse.aether.impl.DependencyCollector component
> >> > >>> (extension
> >> > >>>    of
> >> > >>>
> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector)
> >> > >>>    that writes some information when a dependency is resolved
> >> against
> >> > >>> local
> >> > >>>    repository when it's already there (where no download is
> needed)
> >> > >>>
> >> > >>> In the first case, I write something like this:
> >> > >>>
> >> > >>> ~~~
> >> > >>> Downloaded artifact log4j:log4j:pom::1.2.12 (repository: central (
> >> > >>> https://repo.maven.apache.org/maven2, default, releases))
> >> > >>>    -> commons-logging:commons-logging:jar:1.1 (compile) (context:
> >> > plugin)
> >> > >>>      -> commons-digester:commons-digester:jar:1.8 (compile)
> >> (context:
> >> > >>> plugin)
> >> > >>>        -> org.apache.velocity:velocity-tools:jar:2.0 (compile)
> >> > (context:
> >> > >>> plugin)
> >> > >>>          -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1
> >> > >>> (compile)
> >> > >>> (context: plugin)
> >> > >>>            ->
> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0
> >> ()
> >> > >>> (context: plugin)
> >> > >>>   Reading descriptor for artifact log4j:log4j:jar::1.2.12
> (context:
> >> > >>> plugin)
> >> > >>> (scope: ?) (repository: central (
> >> https://repo.maven.apache.org/maven2,
> >> > >>> default, releases))
> >> > >>>     Transitive dependencies collection for
> >> > >>> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
> >> > >>>       Resolution of plugin
> >> > >>> org.apache.maven.plugins:maven-site-plugin:3.11.0
> >> > (org.apache:apache:25)
> >> > >>> ~~~
> >> > >>> Downloaded artifact log4j:log4j:jar::1.2.12 (repository: central (
> >> > >>> https://repo.maven.apache.org/maven2, default, releases))
> >> > >>>   Resolution of plugin com.mycila:license-maven-plugin:3.0
> >> > >>> (org.apache.camel:camel-buildtools:3.17.0-SNAPSHOT)
> >> > >>>
> >> > >>> I simply write some information from available
> >> > >>> org.eclipse.aether.RepositoryEvent and event's
> >> > >>> org.eclipse.aether.RequestTrace.
> >> > >>>
> >> > >>> More interesting information is written in 2nd case. Because I
> >> wanted
> >> > to
> >> > >>> track ALL attempts to resolve log4j:log4j:1.2.12 (and any other
> >> > >>> dependency), I needed some structure. And I decided this:
> >> > >>>
> >> > >>>    - every dependency directory (where e.g., _remote.repositories
> is
> >> > >>>    written along with the jar/pom/sha1/md5/...) gets ".tracking"
> >> > >>> directory
> >> > >>>    - in ".tracking" directory I write files with names of this
> >> pattern:
> >> > >>>    "groupId_artifactId_type_classifier_version.dep", e.g.,
> >> > >>>    org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
> >> > >>>    - each such file contains a _reverse dependency tree_ that
> shows
> >> me
> >> > >>> why
> >> > >>>    given dependency was resolved.
> >> > >>>
> >> > >>> For example, in
> >> > >>>
> >> > >>>
> >> >
> >>
> ~/.m2/repository/log4j/log4j/1.2.12/.tracking/org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
> >> > >>> (the path itself already contains information that
> >> > >>> org.apache.maven.plugins:maven-dependency-plugin:3.1.2 depends
> >> > (directly
> >> > >>> or
> >> > >>> indirectly) on log4j:log4j:1.2.12).
> >> > >>> The content of this file is:
> >> > >>>
> >> > >>> log4j:log4j:pom:1.2.12
> >> > >>>  -> commons-logging:commons-logging:jar:1.1 (compile) (context:
> >> plugin)
> >> > >>>    -> commons-digester:commons-digester:jar:1.8 (compile)
> (context:
> >> > >>> plugin)
> >> > >>>      -> org.apache.velocity:velocity-tools:jar:2.0 (compile)
> >> (context:
> >> > >>> plugin)
> >> > >>>        -> org.apache.maven.doxia:doxia-site-renderer:jar:1.7.4
> >> > (compile)
> >> > >>> (context: plugin)
> >> > >>>          ->
> >> org.apache.maven.reporting:maven-reporting-impl:jar:3.0.0
> >> > >>> (compile) (context: plugin)
> >> > >>>            ->
> >> > org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2
> >> > >>> ()
> >> > >>> (context: plugin)
> >> > >>>
> >> > >>> It's kind of obvious - dependency-plugin through
> >> maven-reporting-impl,
> >> > >>> through doxia, velocity, commons-digester and commons-logging
> >> "depends"
> >> > >>> on
> >> > >>> vulnerable log4j:1.2.12 library that every security scanner screams
> about.
> >> > >>>
> >> > >>> Since I wrote this extension, I keep it in my $MAVEN_HOME/lib/ext
> >> and
> >> > >>> build
> >> > >>> everything in my work. Now I know why my
> >> > >>> ~/.m2/repository/org/codehaus/plexus/plexus-utils/ directory
> >> contains
> >> > 57
> >> > >>> different versions of plexus-utils. For example, why
> >> 1.0.4
> >> > >>> from
> >> > >>> 2005?
> >> > >>>
> >> > >>> org.codehaus.plexus:plexus-utils:pom:1.0.4
> >> > >>>  ->
> >> > org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1
> >> > >>> (compile) (context: plugin)
> >> > >>>    -> org.codehaus.plexus:plexus-velocity:jar:1.2 (compile)
> >> (context:
> >> > >>> plugin)
> >> > >>>      -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1
> >> (compile)
> >> > >>> (context: plugin)
> >> > >>>        -> org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2
> ()
> >> > >>> (context: plugin)
> >> > >>>
> >> > >>> Why Guava 10.0.1?
> >> > >>>
> >> > >>> com.google.guava:guava:pom:10.0.1
> >> > >>>  -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5
> (compile)
> >> > >>> (context: plugin)
> >> > >>>    -> org.apache.maven:maven-plugin-api:jar:3.1.1 (compile)
> >> (context:
> >> > >>> plugin)
> >> > >>>      -> org.apache.maven:maven-core:jar:3.1.1 (compile) (context:
> >> > plugin)
> >> > >>>        ->
> >> > org.apache.maven.shared:maven-common-artifact-filters:jar:3.2.0
> >> > >>> (runtime) (context: plugin)
> >> > >>>          ->
> >> > org.springframework.boot:spring-boot-maven-plugin:jar:2.5.12
> >> > >>> ()
> >> > >>> (context: plugin)
> >> > >>>
> >> > >>> yes - Spring Boot 2.5.12...
> >> > >>>
> >> > >>> Why Log4j 2.10.0?
> >> > >>>
> >> > >>> org.apache.logging.log4j:log4j-api:pom:2.10.0
> >> > >>>  -> org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0 (compile)
> >> > >>> (context:
> >> > >>> project)
> >> > >>>    ->
> >> > >>>
> >> org.springframework.boot:spring-boot-starter-logging:jar:2.0.5.RELEASE
> >> > >>> (compile) (context: project)
> >> > >>>      ->
> >> org.springframework.boot:spring-boot-starter:jar:2.0.5.RELEASE
> >> > >>> (compile) (context: project)
> >> > >>>        ->
> >> > >>> org.springframework.boot:spring-boot-starter-web:jar:2.0.5.RELEASE
> >> > >>> (compile) (context: project)
> >> > >>>          -> org.keycloak:keycloak-spring-boot-2-adapter:jar:17.0.1
> >> > >>> (context: project)
> >> > >>>
> >> > >>> (see - this time the context is "project", not "plugin").
> >> > >>>
> >> > >>> And so on and so on.
> >> > >>>
> >> > >>> What is my motivation with this email? I don't know yet - ideally
> >> I'd
> >> > >>> like
> >> > >>> to have this ".tracking" information created together with
> >> > >>> "_remote.repositories" and "*.lastUpdated" metadata by Maven
> >> Resolver.
> >> > It
> >> > >>> could be optional of course (the overhead is really minimal - 1
> more
> >> > >>> minute
> >> > >>> when building Camel 3 - 1 hour instead of 59 minutes).
> >> > >>>
> >> > >>> The only problem I had is that I had to fork/shade
> >> > >>>
> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector
> >> > class
> >> > >>> because I had to manipulate
> >> > >>>
> >> > >>>
> >> >
> >>
> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
> >> > >>> stack around the call to
> >> > >>>
> >> > >>>
> >> >
> >>
> org.jboss.fuse.mvnplugins.tracker.TrackingDependencyCollector#processDependency().
> >> > >>> Besides this, normal plexus/sisu components are used.
> >> > >>>
> >> > >>> The repository is
> >> > https://github.com/grgrzybek/tracking-maven-extension
> >> > >>> and
> >> > >>> I'd be happy to see some comments about this ;)
> >> > >>>
> >> > >>> kind regards
> >> > >>> Grzegorz Grzybek
> >> > >>>
> >> > >>
> >> >
> >>
> >
>
