Hello

TL;DR: https://github.com/grgrzybek/tracking-maven-extension

I'd like to share a proof of concept I made. It all started with the
question: why am I getting log4j:log4j:1.2.12 in my local Maven repository
when building a trivial project with a fresh local repo?

I knew I could `grep -r --include='*.pom' 1.2.12` to find the POMs that
declare the old log4j, but I needed something better.

In short, I managed to persist the information available in the
org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
stack.
I wrote a Maven extension that can be put into $MAVEN_HOME/lib/ext or
loaded with "-Dmaven.ext.class.path", and which does two things:

   1. adds an org.eclipse.aether.RepositoryListener component that writes
   some information when a dependency is FIRST downloaded from a remote
   repository
   2. adds an org.eclipse.aether.impl.DependencyCollector component (an
   extension of
   org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector)
   that writes some information when a dependency is resolved against the
   local repository because it's already there (so no download is needed)

In the first case, I write something like this:

~~~
Downloaded artifact log4j:log4j:pom::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
   -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
     -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
       -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
         -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
           -> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 () (context: plugin)
  Reading descriptor for artifact log4j:log4j:jar::1.2.12 (context: plugin) (scope: ?) (repository: central (https://repo.maven.apache.org/maven2, default, releases))
    Transitive dependencies collection for org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
      Resolution of plugin org.apache.maven.plugins:maven-site-plugin:3.11.0 (org.apache:apache:25)

Downloaded artifact log4j:log4j:jar::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
  Resolution of plugin com.mycila:license-maven-plugin:3.0 (org.apache.camel:camel-buildtools:3.17.0-SNAPSHOT)
~~~

Here I simply write some information from the available
org.eclipse.aether.RepositoryEvent and the event's
org.eclipse.aether.RequestTrace.
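
To illustrate the trace part: a RequestTrace is a chain in which every
element carries a data object and a link to its parent, so the indented
lines above can be produced by walking that chain. The sketch below is not
the extension's code. It uses a hypothetical stand-in class `Trace` (plain
strings as data) so that it runs without Maven Resolver on the classpath; a
real listener would presumably start from RepositoryEvent.getTrace() and
follow RequestTrace.getData() / getParent() instead.

```java
import java.util.ArrayList;
import java.util.List;

final class TraceWalk {

    /** Hypothetical stand-in for org.eclipse.aether.RequestTrace: data plus parent link. */
    static final class Trace {
        final Object data;
        final Trace parent;

        Trace(Object data, Trace parent) {
            this.data = data;
            this.parent = parent;
        }
    }

    /** Walks from the given trace up to the root, indenting each level by two more spaces. */
    static List<String> describe(Trace trace) {
        List<String> lines = new ArrayList<>();
        int depth = 1;
        for (Trace t = trace; t != null; t = t.parent) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 2 * depth; i++) {
                sb.append(' ');
            }
            lines.add(sb.append(t.data).toString());
            depth++;
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumption: the data objects are shown as strings; the real ones are request objects.
        Trace plugin = new Trace("Resolution of plugin org.apache.maven.plugins:maven-site-plugin:3.11.0", null);
        Trace collect = new Trace("Transitive dependencies collection for org.apache.maven.plugins:maven-site-plugin:jar:3.11.0", plugin);
        Trace descriptor = new Trace("Reading descriptor for artifact log4j:log4j:jar::1.2.12", collect);
        for (String line : describe(descriptor)) {
            System.out.println(line);
        }
    }
}
```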

More interesting information is written in the second case. Because I
wanted to track ALL attempts to resolve log4j:log4j:1.2.12 (and any other
dependency), I needed some structure. I decided on this:

   - every dependency directory (where e.g., _remote.repositories is
   written along with the jar/pom/sha1/md5/...) gets a ".tracking" directory
   - in the ".tracking" directory I write files with names following this
   pattern: "groupId_artifactId_type_classifier_version.dep", e.g.,
   org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
   - each such file contains a _reverse dependency tree_ that shows me why
   the given dependency was resolved.
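
To make this layout concrete, here is a minimal sketch of how such a path
could be derived. The class and method names are hypothetical and not taken
from the extension. It assumes the standard local repository layout (dots in
the groupId become directories) and that an empty classifier is simply left
out of the file name, as the example name above suggests.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TrackingPaths {

    /** Builds groupId_artifactId_type[_classifier]_version.dep for the artifact that pulled the dependency in. */
    static String fileName(String groupId, String artifactId, String type,
                           String classifier, String version) {
        List<String> parts = new ArrayList<>(Arrays.asList(groupId, artifactId, type));
        if (classifier != null && !classifier.isEmpty()) {
            parts.add(classifier); // assumption: an empty classifier is omitted
        }
        parts.add(version);
        return String.join("_", parts) + ".dep";
    }

    /** Returns the ".tracking" directory next to the tracked artifact's files in the local repository. */
    static Path trackingDir(Path localRepo, String groupId, String artifactId, String version) {
        return localRepo
                .resolve(groupId.replace('.', '/'))
                .resolve(artifactId)
                .resolve(version)
                .resolve(".tracking");
    }

    public static void main(String[] args) {
        // "repository" stands for the local repository root, e.g. ~/.m2/repository
        Path dir = trackingDir(Paths.get("repository"), "log4j", "log4j", "1.2.12");
        Path file = dir.resolve(fileName(
                "org.apache.maven.plugins", "maven-dependency-plugin", "jar", "", "3.1.2"));
        System.out.println(file);
    }
}
```

The directory is derived from the tracked dependency (here log4j), while the
file name is derived from the artifact whose resolution pulled it in.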

For example, take
~/.m2/repository/log4j/log4j/1.2.12/.tracking/org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
(the path itself already tells us that
org.apache.maven.plugins:maven-dependency-plugin:3.1.2 depends, directly or
indirectly, on log4j:log4j:1.2.12).
The content of this file is:

log4j:log4j:pom:1.2.12
 -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
   -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
     -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
       -> org.apache.maven.doxia:doxia-site-renderer:jar:1.7.4 (compile) (context: plugin)
         -> org.apache.maven.reporting:maven-reporting-impl:jar:3.0.0 (compile) (context: plugin)
           -> org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2 () (context: plugin)

It's kind of obvious - maven-dependency-plugin, through
maven-reporting-impl, doxia, velocity, commons-digester and commons-logging,
"depends" on the vulnerable log4j:1.2.12 library every security scanner
screams about.
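
For readers who want to see how such a reverse tree can be printed from a
stack of nodes, here is a small sketch. It is not the extension's code: the
class name is hypothetical and the coordinates in main() are made up. It
assumes the root (plugin or project dependency) was pushed first and the
dependency being resolved was pushed last, so iterating the stack from the
top yields the lines in the order of the file format.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ReverseTree {

    /**
     * Renders nodes from the tracked dependency (first element) up to the root
     * (last element). The first line has no arrow; every following line gets
     * " -> " with two more columns of indentation than the previous one.
     */
    static String render(Iterable<String> leafToRoot) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        for (String node : leafToRoot) {
            if (depth > 0) {
                for (int i = 0; i < 2 * depth - 1; i++) {
                    sb.append(' ');
                }
                sb.append("-> ");
            }
            sb.append(node).append('\n');
            depth++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // push() adds to the head of the deque, so iteration starts at the last pushed node.
        Deque<String> nodes = new ArrayDeque<>();
        nodes.push("example:root-plugin:jar:1.0 () (context: plugin)");      // hypothetical root
        nodes.push("example:middle:jar:1.0 (compile) (context: plugin)");    // hypothetical parent
        nodes.push("example:leaf:pom:1.0");                                  // hypothetical tracked dependency
        System.out.print(render(nodes));
    }
}
```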

Since I wrote this extension, I keep it in my $MAVEN_HOME/lib/ext and use
it for every build at work. Now I know why, for example, my
~/.m2/repository/org/codehaus/plexus/plexus-utils/ directory contains 57
different versions of plexus-utils. Why 1.0.4 from 2005, for instance?

org.codehaus.plexus:plexus-utils:pom:1.0.4
 -> org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1 (compile) (context: plugin)
   -> org.codehaus.plexus:plexus-velocity:jar:1.2 (compile) (context: plugin)
     -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
       -> org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2 () (context: plugin)

Why Guava 10.0.1?

com.google.guava:guava:pom:10.0.1
 -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5 (compile) (context: plugin)
   -> org.apache.maven:maven-plugin-api:jar:3.1.1 (compile) (context: plugin)
     -> org.apache.maven:maven-core:jar:3.1.1 (compile) (context: plugin)
       -> org.apache.maven.shared:maven-common-artifact-filters:jar:3.2.0 (runtime) (context: plugin)
         -> org.springframework.boot:spring-boot-maven-plugin:jar:2.5.12 () (context: plugin)

yes - Spring Boot 2.5.12...

Why Log4j 2.10.0?

org.apache.logging.log4j:log4j-api:pom:2.10.0
 -> org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0 (compile) (context: project)
   -> org.springframework.boot:spring-boot-starter-logging:jar:2.0.5.RELEASE (compile) (context: project)
     -> org.springframework.boot:spring-boot-starter:jar:2.0.5.RELEASE (compile) (context: project)
       -> org.springframework.boot:spring-boot-starter-web:jar:2.0.5.RELEASE (compile) (context: project)
         -> org.keycloak:keycloak-spring-boot-2-adapter:jar:17.0.1 (context: project)

(see - this time the context is "project", not "plugin").

And so on and so on.

What is my motivation for this email? I don't know yet - ideally I'd like
to have this ".tracking" information created by Maven Resolver together
with the "_remote.repositories" and "*.lastUpdated" metadata. It could be
optional of course (the overhead is really minimal - about 1 more minute
when building Camel 3: 1 hour instead of 59 minutes).

The only problem was that I had to fork/shade the
org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector class,
because I had to manipulate the
org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
stack around the call to
org.jboss.fuse.mvnplugins.tracker.TrackingDependencyCollector#processDependency().
Besides this, normal plexus/sisu components are used.

The repository is https://github.com/grgrzybek/tracking-maven-extension and
I'd be happy to see some comments about this ;)

kind regards
Grzegorz Grzybek
