[
https://issues.apache.org/jira/browse/HADOOP-18639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032970#comment-18032970
]
ASF GitHub Bot commented on HADOOP-18639:
-----------------------------------------
github-actions[bot] closed pull request #5425: HADOOP-18639
DockerContainerDeletionTask is not removed from the Nodemanager's statestore
when the task is completed
URL: https://github.com/apache/hadoop/pull/5425
> DockerContainerDeletionTask is not removed from the Nodemanager's statestore
> when the task is completed.
> --------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-18639
> URL: https://issues.apache.org/jira/browse/HADOOP-18639
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 3.1.2
> Reporter: Sejin Hwang
> Priority: Major
> Labels: pull-request-available
>
> YARN NodeManager's deletion service has two types of deletion tasks: the
> FileDeletionTask for deleting log, usercache, appcache files and the
> DockerContainerDeletionTask for deleting Docker containers.
>
> The FileDeletionTask is removed from the statestore when the task is
> completed, but the DockerContainerDeletionTask is not.
> Therefore, completed DockerContainerDeletionTask entries accumulate
> continuously in the statestore.
>
> On restart, the NodeManager's deletion service then runs every
> DockerContainerDeletionTask that has accumulated in the statestore.
> As a result, FileDeletionTask and DockerContainerDeletionTask executions are
> delayed unnecessarily while the accumulated tasks are processed, which can
> cause disk-full issues in environments where a large number of containers
> are allocated and released.
> I will attach a patch soon.
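
The behavior described in the report can be illustrated with a small toy model. This is only a sketch under stated assumptions: StateStore, Task, runAll and leftover are hypothetical names invented for illustration and are not the Hadoop classes or the contents of the pull request. It shows that a task which does not remove its own entry on completion stays in the store, while one that does leaves nothing behind.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: every name here is hypothetical and is NOT a Hadoop API.
class DeletionStateStoreSketch {

    /** Stand-in for the NodeManager statestore: task id -> description. */
    static final class StateStore {
        private final Map<Integer, String> tasks = new LinkedHashMap<>();

        void store(int id, String description) { tasks.put(id, description); }
        void remove(int id) { tasks.remove(id); }
        int size() { return tasks.size(); }
    }

    /** A deletion task that is persisted before it runs. */
    static final class Task {
        final int id;
        final String description;
        final boolean removesItselfWhenDone;

        Task(int id, String description, boolean removesItselfWhenDone) {
            this.id = id;
            this.description = description;
            this.removesItselfWhenDone = removesItselfWhenDone;
        }
    }

    /** Persists each task, "runs" it, and returns how many entries remain. */
    static int runAll(StateStore store, Task... tasks) {
        for (Task t : tasks) {
            store.store(t.id, t.description);
        }
        for (Task t : tasks) {
            // The real deletion work would happen here.
            if (t.removesItselfWhenDone) {
                store.remove(t.id);
            }
        }
        return store.size();
    }

    /** One file task plus two Docker tasks; the flag models the fix. */
    static int leftover(boolean dockerTaskRemovesItself) {
        StateStore store = new StateStore();
        return runAll(store,
            new Task(1, "file: appcache", true),
            new Task(2, "docker: container A", dockerTaskRemovesItself),
            new Task(3, "docker: container B", dockerTaskRemovesItself));
    }

    public static void main(String[] args) {
        System.out.println("entries left, reported behavior: " + leftover(false));
        System.out.println("entries left, removal on completion: " + leftover(true));
    }
}
```

In this model the entries left behind after the first run are exactly what a restarted service would have to replay before reaching new work.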