anujmodi2021 commented on code in PR #8056:
URL: https://github.com/apache/hadoop/pull/8056#discussion_r2549241207
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/enums/AbfsWriteThreadPoolMetricsEnum.java:
##########
@@ -0,0 +1,95 @@
+/**
Review Comment:
Do we need 2 searate classes for these?
All the metrics seems to be common in both.
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManagerV2.java:
##########
@@ -104,14 +109,39 @@ public final class ReadBufferManagerV2 extends
ReadBufferManager {
private static AtomicBoolean isConfigured = new AtomicBoolean(false);
+ /* Metrics collector for monitoring the performance of the ABFS read thread
pool. */
+ private final AbfsReadThreadPoolMetrics readThreadPoolMetrics;
+
+ /* Last recorded CPU time used for computing CPU utilization deltas. */
+ private static long lastCpuTime = 0;
+
+ /* Last recorded system time used for utilization calculations. */
+ private static long lastTime = 0;
+
+ private final AbfsClient abfsClient;
+ /* Tracks the last scale direction applied, or empty if none. */
+ private volatile String lastScaleDirection = EMPTY_STRING;
+ /* Maximum CPU utilization observed during the monitoring interval. */
+ private volatile double maxCpuUtilization = 0.0;
+
/**
- * Private constructor to prevent instantiation as this needs to be
singleton.
+ * Initializes a new instance of {@code ReadBufferManagerV2} for the given
ABFS client.
+ *
+ * @param abfsClient the {@link AbfsClient} used for managing read
operations.
*/
- private ReadBufferManagerV2() {
+ private ReadBufferManagerV2(AbfsClient abfsClient) {
+ this.abfsClient = abfsClient;
Review Comment:
Why do we need client here?
IMO we should not pass client to RBM.
Any client related work should happen in AbfsInputStream how its happening
today.
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -1346,7 +1346,13 @@ public AbfsRestOperation read(final String path,
final AbfsUriQueryBuilder abfsUriQueryBuilder =
createDefaultUriQueryBuilder();
String sasTokenForReuse = appendSASTokenToQuery(path,
SASTokenProvider.READ_OPERATION,
abfsUriQueryBuilder, cachedSasToken);
-
+ // Retrieve the read thread pool metrics from the ABFS counters.
+ AbfsReadThreadPoolMetrics metrics = getAbfsCounters()
Review Comment:
Common code in Blob and Dfs client Can be a common method in base class.
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManagerV2.java:
##########
@@ -104,14 +109,39 @@ public final class ReadBufferManagerV2 extends
ReadBufferManager {
private static AtomicBoolean isConfigured = new AtomicBoolean(false);
+ /* Metrics collector for monitoring the performance of the ABFS read thread
pool. */
+ private final AbfsReadThreadPoolMetrics readThreadPoolMetrics;
+
+ /* Last recorded CPU time used for computing CPU utilization deltas. */
+ private static long lastCpuTime = 0;
+
+ /* Last recorded system time used for utilization calculations. */
+ private static long lastTime = 0;
+
+ private final AbfsClient abfsClient;
+ /* Tracks the last scale direction applied, or empty if none. */
+ private volatile String lastScaleDirection = EMPTY_STRING;
+ /* Maximum CPU utilization observed during the monitoring interval. */
+ private volatile double maxCpuUtilization = 0.0;
+
/**
- * Private constructor to prevent instantiation as this needs to be
singleton.
+ * Initializes a new instance of {@code ReadBufferManagerV2} for the given
ABFS client.
Review Comment:
All the clients in JVM ae supposed to use same ReadBufferManager. This
comment seems wrong
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AzureDFSIngressHandler.java:
##########
@@ -117,6 +117,13 @@ protected AbfsRestOperation remoteWrite(AbfsBlock
blockToUpload,
AppendRequestParameters reqParams,
TracingContext tracingContext) throws IOException {
TracingContext tracingContextAppend = new TracingContext(tracingContext);
+ // Fetches write thread pool metrics from the ABFS client and adds them to
the tracing context.
Review Comment:
Common code Can be moved to base IngressHandler
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java:
##########
@@ -242,6 +253,18 @@ public AzureIngressHandler getIngressHandler() {
private volatile boolean switchCompleted = false;
+ /**
+ * Starts CPU monitoring in the thread pool size manager if it
+ * is initialized and not already monitoring.
+ */
+ private void initializeMonitoringIfNeeded() {
+ if (poolSizeManager != null && !poolSizeManager.isMonitoringStarted()) {
Review Comment:
Do we need to repeat this check inside synchronized block as well?
What if 2 threads check together and both try to startCPUMonitoring?
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -1346,7 +1346,13 @@ public AbfsRestOperation read(final String path,
final AbfsUriQueryBuilder abfsUriQueryBuilder =
createDefaultUriQueryBuilder();
String sasTokenForReuse = appendSASTokenToQuery(path,
SASTokenProvider.READ_OPERATION,
abfsUriQueryBuilder, cachedSasToken);
-
+ // Retrieve the read thread pool metrics from the ABFS counters.
Review Comment:
Don't we need a similar change in append as well?
##########
hadoop-tools/hadoop-azure/pom.xml:
##########
@@ -632,6 +632,8 @@
<exclude>**/azurebfs/ITestSmallWriteOptimization.java</exclude>
<exclude>**/azurebfs/ITestAbfsStreamStatistics*.java</exclude>
<exclude>**/azurebfs/services/ITestReadBufferManager.java</exclude>
+
<exclude>**/azurebfs/WriteThreadPoolSizeManager.java</exclude>
Review Comment:
Why this change?
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsWriteThreadPoolMetrics.java:
##########
@@ -0,0 +1,165 @@
+/**
Review Comment:
Same here.
A lot ff redundant code can be combined.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]