[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17647656#comment-17647656
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-----------------------------------------

steveloughran commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1048802203


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -459,6 +458,13 @@ public void initialize(URI name, Configuration originalConf)
     AuditSpan span = null;
     try {
       LOG.debug("Initializing S3AFileSystem for {}", bucket);
+      if (LOG.isTraceEnabled()) {
+        // log a full trace for deep diagnostics of where an object is created,
+        // for tracking down memory leak issues.
+        LOG.trace("Filesystem for {} created; fs.s3a.impl.disable.cache = {}",
+            name, originalConf.getBoolean("fs.s3a.impl.disable.cache", false),
+            new RuntimeException(super.toString()));

Review Comment:
   1. Creating the RuntimeException captures a stack trace, which the trace 
call then logs, so we can see who is creating the FS instance.
   2. Calling super.toString() yields the minimal instance ID rather than the 
expanded toString() with all the statistics. This is initialize(); most of 
those values are still unset, so printing them would only confuse.
   
   That stack trace is the most important part: it lets us track down where 
FS instances are being created and then not closed.
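
   The technique in the two points above can be sketched outside Hadoop. This 
is a minimal illustration that uses only the JDK rather than the SLF4J logger 
from the patch; TrackedResource, CreationTraceDemo and openResource() are 
hypothetical names, not part of the PR. It captures the trace eagerly for 
simplicity, whereas the patch only builds the exception when trace logging is 
enabled.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical stand-in for an object whose creation site we want to record. */
class TrackedResource {
  /** Stack trace of the constructor call, rendered as text. */
  private final String creationTrace;

  TrackedResource() {
    // super.toString() is Object.toString() here: "ClassName@hexHash".
    // It gives a short instance ID and skips the verbose override below.
    RuntimeException where = new RuntimeException(super.toString());
    // The exception is never thrown; it exists only to capture the call stack.
    StringWriter buffer = new StringWriter();
    where.printStackTrace(new PrintWriter(buffer, true));
    creationTrace = buffer.toString();
  }

  /** Returns the instance ID followed by the frames that led to construction. */
  String creationTrace() {
    return creationTrace;
  }

  /** Verbose form, deliberately not used when recording the creation site. */
  @Override
  public String toString() {
    return super.toString() + "{bytesRead=0, bytesWritten=0}";
  }
}

class CreationTraceDemo {
  /** Hypothetical factory; its frame appears in the recorded trace. */
  static TrackedResource openResource() {
    return new TrackedResource();
  }

  public static void main(String[] args) {
    System.out.println(openResource().creationTrace());
  }
}
```

   With a logging framework, passing the exception as the final argument of 
the log call (as the patch does) has the same effect: the logger prints the 
message followed by the captured stack.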





> Leak of S3AInstrumentation instances via hadoop Metrics references
> ------------------------------------------------------------------
>
>                 Key: HADOOP-18526
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18526
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.4
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Blocker
>              Labels: pull-request-available
>
> A heap dump of a process that ran out of memory shows that if a process 
> creates and then destroys many S3AFS instances, it appears to run out of 
> heap because references to S3AInstrumentation and the IOStatisticsStore are 
> kept via the Hadoop metrics registry.
> S3AInstrumentation.close() does not appear to be invoked in S3AFS.close(). 
> It should be, with the IOStats snapshotted to a local reference before this 
> happens, so that the stats of a closed FS can still be examined.
> org.apache.hadoop.ipc.DecayRpcScheduler.MetricsProxy uses a WeakReference 
> to refer back to the larger object. We should do the same for the abfs/s3a 
> bindings, ideally through a template proxy class in hadoop-common that both 
> can use.
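
The two remedies proposed in the description (snapshot the statistics on 
close, and let the registry hold only a weakly referencing proxy) can be 
sketched as follows. This is an illustration under stated assumptions, not the 
Hadoop implementation: DemoRegistry, DemoInstrumentation and StatsProxy are 
hypothetical classes, and the registry is a plain map standing in for a 
process-wide metrics system.

```java
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical long-lived registry; it only ever holds small proxies. */
class DemoRegistry {
  private final Map<String, StatsProxy> sources = new ConcurrentHashMap<>();
  void register(String name, StatsProxy proxy) { sources.put(name, proxy); }
  void unregister(String name) { sources.remove(name); }
  int size() { return sources.size(); }
}

/** Small object handed to the registry; refers back to the heavy one weakly. */
class StatsProxy {
  private final WeakReference<DemoInstrumentation> target;

  StatsProxy(DemoInstrumentation instrumentation) {
    this.target = new WeakReference<>(instrumentation);
  }

  /** Reads the counters, or an empty map once the target is gone or detached. */
  Map<String, Long> read() {
    DemoInstrumentation live = target.get();
    return live == null ? Collections.<String, Long>emptyMap() : live.snapshot();
  }

  /** Drops the back-reference explicitly, without waiting for the GC. */
  void detach() { target.clear(); }
}

/** Hypothetical heavy per-filesystem statistics holder. */
class DemoInstrumentation implements AutoCloseable {
  private final Map<String, Long> counters = new HashMap<>();
  private final DemoRegistry registry;
  private final String name;
  private final StatsProxy proxy;
  private Map<String, Long> finalSnapshot; // non-null once closed

  DemoInstrumentation(DemoRegistry registry, String name) {
    this.registry = registry;
    this.name = name;
    // Sketch only: publishing 'this' from a constructor is usually avoided.
    this.proxy = new StatsProxy(this);
    registry.register(name, proxy);
  }

  StatsProxy proxy() { return proxy; }

  synchronized void increment(String key) {
    if (finalSnapshot == null) {
      counters.merge(key, 1L, Long::sum);
    }
  }

  /** Copy of the live counters, or the frozen snapshot after close(). */
  synchronized Map<String, Long> snapshot() {
    return finalSnapshot != null ? finalSnapshot : new HashMap<>(counters);
  }

  /** Snapshots the stats first, then removes every path from the registry. */
  @Override
  public synchronized void close() {
    if (finalSnapshot == null) {
      finalSnapshot = Collections.unmodifiableMap(new HashMap<>(counters));
      registry.unregister(name);
      proxy.detach();
    }
  }
}
```

In this sketch, even if close() is never called, the registry reaches the 
heavy object only through a WeakReference, so the registry alone does not keep 
it on the heap; calling close() additionally removes the proxy and keeps the 
final counter values readable through snapshot().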



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

