[ 
https://issues.apache.org/jira/browse/MNG-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiong Luyao updated MNG-7509:
-----------------------------
    Description: 
When Maven resolves dependencies, it creates many dependency/artifact 
instances whose content is exactly the same but which belong to different POM 
models. This costs a huge amount of memory when a parent POM whose 
dependencyManagement section manages a lot of dependencies is used by many 
project libraries.

(libraries_count * managedDependency_count) dependency instances are created. 
For example, if there are 3000 libraries and they all use the same parent POM, 
which manages the versions of 3000 dependencies, then 3000 * 3000 = 9,000,000 
dependency instances are created. Most of them are identical; in fact, we only 
need one instance for each dependency in the parent POM (3000 dependency 
instances).
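
The arithmetic above can be checked with a small, scaled-down sketch. It is 
not Maven or Aether code: Dep, DepPool and retainedInstances are hypothetical 
names used only for illustration, and the pool is one possible way to share 
equal instances, not a description of how Maven should or does implement it. 
With 100 libraries and 100 managed entries it retains 10,000 objects without 
sharing and 100 with sharing.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;

/** Illustrative only: these are not Maven's or Aether's actual classes. */
final class DependencyInterningSketch {

    /** Hypothetical immutable stand-in for one managed dependency entry. */
    static final class Dep {
        final String groupId;
        final String artifactId;
        final String version;

        Dep(String groupId, String artifactId, String version) {
            this.groupId = groupId;
            this.artifactId = artifactId;
            this.version = version;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Dep)) return false;
            Dep other = (Dep) o;
            return groupId.equals(other.groupId)
                    && artifactId.equals(other.artifactId)
                    && version.equals(other.version);
        }

        @Override
        public int hashCode() {
            return Objects.hash(groupId, artifactId, version);
        }
    }

    /** Canonicalizing pool: hands back one shared instance per distinct content. */
    static final class DepPool {
        private final Map<Dep, Dep> pool = new HashMap<>();

        Dep intern(Dep candidate) {
            Dep existing = pool.putIfAbsent(candidate, candidate);
            return existing != null ? existing : candidate;
        }
    }

    /** Counts the distinct object identities retained across all library models. */
    static int retainedInstances(int libraries, int managed, boolean usePool) {
        DepPool pool = new DepPool();
        Map<Dep, Boolean> identities = new IdentityHashMap<>();
        for (int lib = 0; lib < libraries; lib++) {
            for (int i = 0; i < managed; i++) {
                // Each library's model re-creates an equal entry from the shared parent POM.
                Dep dep = new Dep("com.example", "lib-" + i, "1.0.0");
                identities.put(usePool ? pool.intern(dep) : dep, Boolean.TRUE);
            }
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println("without pool: " + retainedInstances(100, 100, false)); // 10000
        System.out.println("with pool:    " + retainedInstances(100, 100, true));  // 100
    }
}
```

Sharing like this is only safe because the sketch's Dep is immutable; any 
mutable per-model state would have to stay outside the shared instance.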
 
I'm from eBay, and here is a real case from an enterprise-level project. We 
have about 3000 business domain libraries with dependencies between them. We 
need to build all of the libraries in one release so that every library in a 
release is based on the same code. So we used a parent POM as the central place 
to manage all the versions for a release, and all of those libraries use it. 
As the picture below shows, when the release starts, it first builds the 
libraries that do not depend on any others, then builds each library whose 
dependencies have already been built, and repeats this process until all 
libraries are built.

With Maven's current resolution logic, building the libraries this way costs a 
huge amount of memory. Even after the libraries have been released, building a 
project that contains a lot of them also costs a huge amount of memory.

So for now, we have to specify the version in each library's POM file instead 
of using the parent POM.
 
!image-2022-07-09-09-37-53-823.png|width=493,height=226!
 
Here is a thread dump taken while building a real project that depends on 
about 1000 of the above libraries. The top 5 objects are all related to 
org.eclipse.aether.graph.Dependency.
!image-2022-07-09-09-38-26-354.png|width=510,height=199!
 
 

  was:
When Maven resolves dependencies, it creates many dependency/artifact 
instances whose content is exactly the same but which belong to different POM 
models. This costs a huge amount of memory when a parent POM whose 
dependencyManagement section manages a lot of dependencies is used by many 
project libraries.

(libraries_count * managedDependency_count) dependency instances are created. 
For example, if there are 3000 libraries and they all use the same parent POM, 
which manages the versions of 3000 dependencies, then 3000 * 3000 = 9,000,000 
dependency instances are created. Most of them are identical; in fact, we only 
need one instance for each dependency in the parent POM (3000 dependency 
instances).
 
Here is a real case from an enterprise-level project. We have about 3000 
business domain libraries with dependencies between them. We need to build all 
of the libraries in one release so that every library in a release is based on 
the same code. So we used a parent POM to manage all the versions for a 
release, and all of those libraries use it. As the picture below shows, when 
the release starts, it first builds the libraries that do not depend on any 
others, then builds each library whose dependencies have already been built, 
and repeats this process until all libraries are built.

With Maven's current resolution logic, building the libraries this way costs a 
huge amount of memory. Even after the libraries have been released, building a 
project that contains a lot of them also costs a huge amount of memory.
 
!image-2022-07-09-09-37-53-823.png|width=493,height=226!
 
Here is a thread dump taken while building a real project that depends on 
about 1000 of the above libraries. The top 5 objects are all related to 
org.eclipse.aether.graph.Dependency.
!image-2022-07-09-09-38-26-354.png|width=510,height=199!
 
 


> Huge memory cost when parent pom widely used in a big project for 
> dependencyManagement
> --------------------------------------------------------------------------------------
>
>                 Key: MNG-7509
>                 URL: https://issues.apache.org/jira/browse/MNG-7509
>             Project: Maven
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: Xiong Luyao
>            Priority: Major
>         Attachments: image-2022-07-09-09-37-53-823.png, 
> image-2022-07-09-09-38-26-354.png, image-2022-07-09-10-27-12-668.png, 
> image-2022-07-09-10-27-56-437.png, image-2022-07-09-10-28-05-706.png, 
> image-2022-07-09-10-28-22-864.png, image-2022-07-09-10-28-35-341.png, 
> image-2022-07-09-10-28-40-612.png, image-2022-07-09-10-29-04-045.png, 
> image-2022-07-09-10-29-15-822.png, image-2022-07-09-10-29-21-991.png, 
> image-2022-07-09-10-29-46-216.png, image-2022-07-09-10-29-51-456.png
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
