[ 
https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Divij Vaidya updated KAFKA-16052:
---------------------------------
    Description: 
*Problem*
Our test suite frequently fails with OOM errors. The mailing list discussion is 
here: [https://lists.apache.org/thread/d5js0xpsrsvhgjb10mbzo9cwsy8087x4] 


*Setup*
To find the source of the leaks, I ran the :core:test build target with a 
single thread (see below for how) and attached a profiler to it. This Jira 
tracks the list of action items identified from the analysis.

How to run tests using a single thread:
{code:java}
diff --git a/build.gradle b/build.gradle
index f7abbf4f0b..81df03f1ee 100644
--- a/build.gradle
+++ b/build.gradle
@@ -74,9 +74,8 @@ ext {
       "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
     )

-  maxTestForks = project.hasProperty('maxParallelForks') ? maxParallelForks.toInteger() : Runtime.runtime.availableProcessors()
-  maxScalacThreads = project.hasProperty('maxScalacThreads') ? maxScalacThreads.toInteger() :
-      Math.min(Runtime.runtime.availableProcessors(), 8)
+  maxTestForks = 1
+  maxScalacThreads = 1
   userIgnoreFailures = project.hasProperty('ignoreFailures') ? ignoreFailures : false

   userMaxTestRetries = project.hasProperty('maxTestRetries') ? maxTestRetries.toInteger() : 0
diff --git a/gradle.properties b/gradle.properties
index 4880248cac..ee4b6e3bc1 100644
--- a/gradle.properties
+++ b/gradle.properties
@@ -30,4 +30,4 @@ scalaVersion=2.13.12
 swaggerVersion=2.2.8
 task=build
 org.gradle.jvmargs=-Xmx2g -Xss4m -XX:+UseParallelGC
-org.gradle.parallel=true
+org.gradle.parallel=false
{code}

*Result of experiment*
Heap usage grows from tens of MB at the start to 1.5GB (with spikes to 2GB) as 
the tests execute. Note that the total number of threads also increases, but it 
does not correlate with the sharp increases in heap usage. The heap dump is 
available at 
[https://www.dropbox.com/scl/fi/nwtgc6ir6830xlfy9z9cu/GradleWorkerMain_10311_27_12_2023_13_37_08.hprof.zip?rlkey=ozbdgh5vih4rcynnxbatzk7ln&dl=0]
 

!Screenshot 2023-12-27 at 14.22.21.png!
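
A minimal sketch of how heap usage and thread count could be sampled side by side to check the correlation described above. This is not the profiler setup used for this analysis. The class name HeapSampler and the CSV layout are hypothetical, and the sampler only observes the JVM it runs in, so it would have to be started inside the test worker JVM (for example from a daemon thread) to be useful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Kafka: prints heap usage and live thread
// count of the current JVM at a fixed interval, one CSV row per sample.
public class HeapSampler {

    /** Returns one CSV row: epoch millis, heap used in MB, live thread count. */
    static String sample(MemoryMXBean memory, ThreadMXBean threads) {
        long usedMb = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
        return System.currentTimeMillis() + "," + usedMb + "," + threads.getThreadCount();
    }

    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        // Number of samples to take; defaults to 3 when no argument is given.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;

        System.out.println("timestamp_ms,heap_used_mb,thread_count");
        for (int i = 0; i < samples; i++) {
            System.out.println(sample(memory, threads));
            Thread.sleep(500); // sampling interval, chosen arbitrarily
        }
    }
}
```

Plotting heap_used_mb against thread_count from the resulting rows would show whether the two grow together or independently.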

  was:
Our test suite is failing with frequent OOM. Discussion in the mailing list is 
here: [https://lists.apache.org/thread/d5js0xpsrsvhgjb10mbzo9cwsy8087x4] 

To find the source of leaks, I ran the :core:test build target with a single 
thread (see below on how to do it) and attached a profiler to it. This Jira 
tracks the list of action items identified from the analysis.

How to run tests using a single thread:
{code:java}
diff --git a/build.gradle b/build.gradle
index f7abbf4f0b..81df03f1ee 100644
--- a/build.gradle
+++ b/build.gradle
@@ -74,9 +74,8 @@ ext {
       "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
     )

-  maxTestForks = project.hasProperty('maxParallelForks') ? maxParallelForks.toInteger() : Runtime.runtime.availableProcessors()
-  maxScalacThreads = project.hasProperty('maxScalacThreads') ? maxScalacThreads.toInteger() :
-      Math.min(Runtime.runtime.availableProcessors(), 8)
+  maxTestForks = 1
+  maxScalacThreads = 1
   userIgnoreFailures = project.hasProperty('ignoreFailures') ? ignoreFailures : false

   userMaxTestRetries = project.hasProperty('maxTestRetries') ? maxTestRetries.toInteger() : 0
diff --git a/gradle.properties b/gradle.properties
index 4880248cac..ee4b6e3bc1 100644
--- a/gradle.properties
+++ b/gradle.properties
@@ -30,4 +30,4 @@ scalaVersion=2.13.12
 swaggerVersion=2.2.8
 task=build
 org.gradle.jvmargs=-Xmx2g -Xss4m -XX:+UseParallelGC
-org.gradle.parallel=true
+org.gradle.parallel=false
{code}
This is how the heap mempry utilized looks like, starting from tens of MB to 
ending with 1.5GB (with spikes of 2GB) of heap being used as the test executes. 
Note that the total number of threads also increases but it does not correlate 
with sharp increase in heap memory usage. The heap dump is available at 
[https://www.dropbox.com/scl/fi/nwtgc6ir6830xlfy9z9cu/GradleWorkerMain_10311_27_12_2023_13_37_08.hprof.zip?rlkey=ozbdgh5vih4rcynnxbatzk7ln&dl=0]
 

!Screenshot 2023-12-27 at 14.22.21.png!


> OOM in Kafka test suite
> -----------------------
>
>                 Key: KAFKA-16052
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16052
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Divij Vaidya
>            Priority: Major
>         Attachments: Screenshot 2023-12-27 at 14.04.52.png, Screenshot 
> 2023-12-27 at 14.22.21.png
>
>
> *Problem*
> Our test suite frequently fails with OOM errors. The mailing list discussion 
> is here: [https://lists.apache.org/thread/d5js0xpsrsvhgjb10mbzo9cwsy8087x4] 
> *Setup*
> To find the source of the leaks, I ran the :core:test build target with a 
> single thread (see below for how) and attached a profiler to it. This Jira 
> tracks the list of action items identified from the analysis.
> How to run tests using a single thread:
> {code:java}
> diff --git a/build.gradle b/build.gradle
> index f7abbf4f0b..81df03f1ee 100644
> --- a/build.gradle
> +++ b/build.gradle
> @@ -74,9 +74,8 @@ ext {
>        "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
>      )
>
> -  maxTestForks = project.hasProperty('maxParallelForks') ? maxParallelForks.toInteger() : Runtime.runtime.availableProcessors()
> -  maxScalacThreads = project.hasProperty('maxScalacThreads') ? maxScalacThreads.toInteger() :
> -      Math.min(Runtime.runtime.availableProcessors(), 8)
> +  maxTestForks = 1
> +  maxScalacThreads = 1
>    userIgnoreFailures = project.hasProperty('ignoreFailures') ? ignoreFailures : false
>
>    userMaxTestRetries = project.hasProperty('maxTestRetries') ? maxTestRetries.toInteger() : 0
> diff --git a/gradle.properties b/gradle.properties
> index 4880248cac..ee4b6e3bc1 100644
> --- a/gradle.properties
> +++ b/gradle.properties
> @@ -30,4 +30,4 @@ scalaVersion=2.13.12
>  swaggerVersion=2.2.8
>  task=build
>  org.gradle.jvmargs=-Xmx2g -Xss4m -XX:+UseParallelGC
> -org.gradle.parallel=true
> +org.gradle.parallel=false
> {code}
> *Result of experiment*
> Heap usage grows from tens of MB at the start to 1.5GB (with spikes to 2GB) 
> as the tests execute. Note that the total number of threads also increases, 
> but it does not correlate with the sharp increases in heap usage. The heap 
> dump is available at 
> [https://www.dropbox.com/scl/fi/nwtgc6ir6830xlfy9z9cu/GradleWorkerMain_10311_27_12_2023_13_37_08.hprof.zip?rlkey=ozbdgh5vih4rcynnxbatzk7ln&dl=0]
>  
> !Screenshot 2023-12-27 at 14.22.21.png!


