Hi Keith,

Which Hadoop version are you using? Judging from the log, it could be
a bug in the Capacity Scheduler [1].
Also, have you looked at the node manager log of the node "worker14:40196"?

[1] https://issues.apache.org/jira/browse/YARN-2628

Terence

On Wed, May 4, 2016 at 8:44 AM, Keith Turner <[email protected]> wrote:

> I ran into an issue where YARN does not seem to be starting containers again
> for an application after some containers died.  The details of the issue I
> am running into are outlined in fluo#657 [1].
>
> Twill seems to be trying to restart the containers, but it seems YARN is
> not doing it. Looking at the YARN RM web page, there are enough cores and
> memory available to start the containers, so I am not sure why it's not
> starting them.
>
> Does anyone have any tips for debugging this issue or have a second to look
> at the logs attached to fluo#657?
>
> [1] : https://github.com/fluo-io/fluo/issues/657
>
