[jira] [Commented] (HADOOP-19737) ABFS: Add metrics to identify improvements with read and write aggressiveness

ASF GitHub Bot (Jira) Tue, 18 Nov 2025 02:20:04 -0800


    [ https://issues.apache.org/jira/browse/HADOOP-19737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039105#comment-18039105 ]


ASF GitHub Bot commented on HADOOP-19737:
-----------------------------------------

manika137 commented on code in PR #8056:
URL: https://github.com/apache/hadoop/pull/8056#discussion_r2537331384


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/WriteThreadPoolSizeManager.java:
##########
@@ -394,4 +481,99 @@ public void close() throws IOException {
       }
     }
   }
+
+  /**
+   * Represents current statistics of the write thread pool and system.
+   */
+  public static class WriteThreadPoolStats {
+    private final int currentPoolSize;   // matches CURRENT_POOL_SIZE metric
+    private final int maxPoolSize;       // matches MAX_POOL_SIZE metric
+    private final int activeThreads;     // matches ACTIVE_THREADS metric
+    private final double jvmCpuUtilization; // matches JVM_CPU_UTILIZATION metric
+    private final double cpuUtilization; // matches CPU_UTILIZATION metric
+    private final long availableHeapGB;  // matches MEMORY_UTILIZATION metric
+
+    /**
+     * Constructs a {@link WriteThreadPoolStats} instance with the given thread pool
+     * and system utilization metrics.
+     *
+     * @param currentPoolSize     the current number of threads in the pool.
+     * @param maxPoolSize         the maximum allowed thread pool size.
+     * @param activeThreads       the number of currently active threads.
+     * @param jvmCpuUtilization   the JVM CPU utilization percentage.
+     * @param cpuUtilization      the overall system CPU utilization percentage.
+     * @param availableHeapGB     the available heap memory in gigabytes.
+     */
+    public WriteThreadPoolStats(int currentPoolSize, int maxPoolSize,
+        int activeThreads, double jvmCpuUtilization, double cpuUtilization, long availableHeapGB) {
+      this.currentPoolSize = currentPoolSize;
+      this.maxPoolSize = maxPoolSize;
+      this.activeThreads = activeThreads;
+      this.jvmCpuUtilization = jvmCpuUtilization;
+      this.cpuUtilization = cpuUtilization;
+      this.availableHeapGB = availableHeapGB;
+    }
+
+    /** @return the current number of threads in the pool. */
+    public int getCurrentPoolSize() {
+      return currentPoolSize;
+    }
+
+    /** @return the maximum allowed size of the thread pool. */
+    public int getMaxPoolSize() {
+      return maxPoolSize;
+    }
+
+    /** @return the number of threads currently executing tasks. */
+    public int getActiveThreads() {
+      return activeThreads;
+    }
+
+    /** @return the JVM process CPU utilization percentage. */
+    public double getJvmCpuUtilization() {
+      return jvmCpuUtilization;
+    }
+
+    /** @return the overall system CPU utilization percentage. */
+    public double getCpuUtilization() {
+      return cpuUtilization;
+    }
+
+    /** @return the available heap memory in gigabytes. */
+    public long getMemoryUtilization() {
+      return availableHeapGB;
+    }
+
+    @Override
+    public String toString() {
+      return String.format(
+          "currentPoolSize=%d, maxPoolSize=%d, activeThreads=%d, jvmCpuUtilization=%.2f%%, cpuUtilization=%.2f%%, availableHeap=%dGB",
+          currentPoolSize, maxPoolSize, activeThreads, jvmCpuUtilization, cpuUtilization * HUNDRED, availableHeapGB);

Review Comment:
   Don't we need to multiply jvmCpuUtilization by 100 as well?
   Also, should we be multiplying by HUNDRED_D instead of HUNDRED?
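
   To make the question concrete, here is a minimal sketch of the formatting if both values were scaled the same way. It assumes both utilization values are fractions between 0.0 and 1.0, which the diff does not confirm for jvmCpuUtilization. The class name and the value of HUNDRED_D are hypothetical stand-ins, not code from the PR.

```java
import java.util.Locale;

// Hypothetical illustration only; not the PR's WriteThreadPoolStats class.
final class PercentFormatSketch {
  // Assumed value for the HUNDRED_D constant named in the review comment.
  private static final double HUNDRED_D = 100.0;

  /**
   * Formats two utilization fractions as percentages.
   * Assumes both inputs are in the range 0.0 to 1.0.
   */
  static String describe(double jvmCpuUtilization, double cpuUtilization) {
    // Locale.ROOT keeps the decimal separator a '.' regardless of platform locale.
    return String.format(Locale.ROOT,
        "jvmCpuUtilization=%.2f%%, cpuUtilization=%.2f%%",
        jvmCpuUtilization * HUNDRED_D, cpuUtilization * HUNDRED_D);
  }

  public static void main(String[] args) {
    // prints "jvmCpuUtilization=25.00%, cpuUtilization=50.00%"
    System.out.println(describe(0.25, 0.5));
  }
}
```

   If HUNDRED is an int constant, `cpuUtilization * HUNDRED` is still evaluated as a double under Java's numeric promotion, so switching to HUNDRED_D would be a readability and consistency choice rather than a change in the result.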





> ABFS: Add metrics to identify improvements with read and write aggressiveness
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-19737
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19737
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0, 3.4.2
>            Reporter: Anmol Asrani
>            Assignee: Anmol Asrani
>            Priority: Major
>              Labels: pull-request-available
>
> Introduces new performance metrics in the ABFS driver to monitor and evaluate 
> the effectiveness of read and write aggressiveness tuning. These metrics help 
> in understanding how thread pool behavior, CPU utilization, and heap 
> availability impact overall I/O throughput and latency. By capturing detailed 
> statistics such as active thread count, pool size, and system resource 
> utilization, this enhancement enables data-driven analysis of optimizations 
> made to improve ABFS read and write performance under varying workloads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

