[
https://issues.apache.org/jira/browse/HADOOP-19737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039105#comment-18039105
]
ASF GitHub Bot commented on HADOOP-19737:
-----------------------------------------
manika137 commented on code in PR #8056:
URL: https://github.com/apache/hadoop/pull/8056#discussion_r2537331384
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/WriteThreadPoolSizeManager.java:
##########
@@ -394,4 +481,99 @@ public void close() throws IOException {
}
}
}
+
+ /**
+ * Represents current statistics of the write thread pool and system.
+ */
+ public static class WriteThreadPoolStats {
+ private final int currentPoolSize; // matches CURRENT_POOL_SIZE metric
+ private final int maxPoolSize; // matches MAX_POOL_SIZE metric
+ private final int activeThreads; // matches ACTIVE_THREADS metric
+ private final double jvmCpuUtilization; // matches JVM_CPU_UTILIZATION metric
+ private final double cpuUtilization; // matches CPU_UTILIZATION metric
+ private final long availableHeapGB; // matches MEMORY_UTILIZATION metric
+
+ /**
+ * Constructs a {@link WriteThreadPoolStats} instance with the given thread pool
+ * and system utilization metrics.
+ *
+ * @param currentPoolSize the current number of threads in the pool.
+ * @param maxPoolSize the maximum allowed thread pool size.
+ * @param activeThreads the number of currently active threads.
+ * @param jvmCpuUtilization the JVM CPU utilization percentage.
+ * @param cpuUtilization the overall system CPU utilization percentage.
+ * @param availableHeapGB the available heap memory in gigabytes.
+ */
+ public WriteThreadPoolStats(int currentPoolSize, int maxPoolSize,
+ int activeThreads, double jvmCpuUtilization, double cpuUtilization, long availableHeapGB) {
+ this.currentPoolSize = currentPoolSize;
+ this.maxPoolSize = maxPoolSize;
+ this.activeThreads = activeThreads;
+ this.jvmCpuUtilization = jvmCpuUtilization;
+ this.cpuUtilization = cpuUtilization;
+ this.availableHeapGB = availableHeapGB;
+ }
+
+ /** @return the current number of threads in the pool. */
+ public int getCurrentPoolSize() {
+ return currentPoolSize;
+ }
+
+ /** @return the maximum allowed size of the thread pool. */
+ public int getMaxPoolSize() {
+ return maxPoolSize;
+ }
+
+ /** @return the number of threads currently executing tasks. */
+ public int getActiveThreads() {
+ return activeThreads;
+ }
+
+ /** @return the JVM process CPU utilization percentage. */
+ public double getJvmCpuUtilization() {
+ return jvmCpuUtilization;
+ }
+
+ /** @return the overall system CPU utilization percentage. */
+ public double getCpuUtilization() {
+ return cpuUtilization;
+ }
+
+ /** @return the available heap memory in gigabytes. */
+ public long getMemoryUtilization() {
+ return availableHeapGB;
+ }
+
+ @Override
+ public String toString() {
+ return String.format(
+ "currentPoolSize=%d, maxPoolSize=%d, activeThreads=%d, jvmCpuUtilization=%.2f%%, cpuUtilization=%.2f%%, availableHeap=%dGB",
+ currentPoolSize, maxPoolSize, activeThreads, jvmCpuUtilization, cpuUtilization * HUNDRED, availableHeapGB);
Review Comment:
Don't we also need to multiply jvmCpuUtilization by 100?
Also, should we be multiplying by HUNDRED_D instead of HUNDRED?
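To illustrate the concern, here is a minimal sketch rather than the PR's actual code. It assumes both CPU values are stored as fractions in [0, 1], and that HUNDRED is an int constant equal to 100 and HUNDRED_D a double constant equal to 100.0; the class name, the format method, and both constant definitions are hypothetical stand-ins. If jvmCpuUtilization is already a percentage in the real code, the first point does not apply.

```java
import java.util.Locale;

public class PercentFormatSketch {
    // Hypothetical stand-ins for the constants named in the review comment.
    public static final int HUNDRED = 100;
    public static final double HUNDRED_D = 100.0;

    // Scales both fractions the same way before printing them with "%.2f%%",
    // so the two percentages in the output are directly comparable.
    public static String format(double jvmCpuFraction, double cpuFraction) {
        return String.format(Locale.ROOT,
            "jvmCpuUtilization=%.2f%%, cpuUtilization=%.2f%%",
            jvmCpuFraction * HUNDRED_D, cpuFraction * HUNDRED_D);
    }

    public static void main(String[] args) {
        System.out.println(format(0.25, 0.5));
        // Java promotes the int operand to double, so both products are equal.
        System.out.println(0.5 * HUNDRED == 0.5 * HUNDRED_D);
    }
}
```

Under these assumptions, multiplying a double by the int constant gives the same value as multiplying by the double constant, so choosing HUNDRED_D would mainly make the floating-point intent explicit. Scaling only one of the two values, however, would print them on different scales.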
> ABFS: Add metrics to identify improvements with read and write aggressiveness
> -----------------------------------------------------------------------------
>
> Key: HADOOP-19737
> URL: https://issues.apache.org/jira/browse/HADOOP-19737
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.5.0, 3.4.2
> Reporter: Anmol Asrani
> Assignee: Anmol Asrani
> Priority: Major
> Labels: pull-request-available
>
> Introduces new performance metrics in the ABFS driver to monitor and evaluate
> the effectiveness of read and write aggressiveness tuning. These metrics help
> in understanding how thread pool behavior, CPU utilization, and heap
> availability impact overall I/O throughput and latency. By capturing detailed
> statistics such as active thread count, pool size, and system resource
> utilization, this enhancement enables data-driven analysis of optimizations
> made to improve ABFS read and write performance under varying workloads.