There is an idle timeout for map/reduce tasks. If a task makes no progress for 10 minutes (the default), the AM will kill it on 2.0 and the JT will kill it on 1.0. But I don't know of anything associated with a job as a whole, other than that in 0.23, if the AM does not heartbeat back in for too long, I believe the RM may kill it and retry, but I don't know for sure.
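For reference, the per-task idle timeout Bobby describes is governed by a single property; a minimal mapred-site.xml sketch, assuming the Hadoop 1.x key name (2.x/0.23 renamed it `mapreduce.task.timeout`):

```xml
<!-- mapred-site.xml: kill a task that reports no progress for this long.
     Value is in milliseconds; 600000 (10 minutes) is the default,
     and 0 disables the timeout entirely. Key shown is the 1.x name;
     on 2.x/0.23 the equivalent key is mapreduce.task.timeout. -->
<property>
  <name>mapred.task.timeout</name>
  <value>600000</value>
</property>
```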
--Bobby Evans

On 5/11/12 10:53 AM, "Harsh J" <[email protected]> wrote:

Am not aware of a job-level timeout or idle monitor.

On Fri, May 11, 2012 at 7:33 PM, Shi Yu <[email protected]> wrote:
> Is there any risk to suppressing a job too long in FS? I guess there are
> some parameters to control the waiting time of a job (such as a timeout).
> For example, if a job is kept idle for more than 24 hours, is there
> a configuration deciding whether to kill or keep that job?
>
> Shi
>
> On 5/11/2012 6:52 AM, Rita wrote:
>> Thanks. I think I will investigate the capacity scheduler.
>>
>> On Fri, May 11, 2012 at 7:26 AM, Michael Segel <[email protected]> wrote:
>>
>>> Just a quick note...
>>>
>>> If your task is currently occupying a slot, the only way to release the
>>> slot is to kill the specific task.
>>> If you are using FS, you can move the task to another queue and/or you
>>> can lower the job's priority, which will cause its new tasks to spawn
>>> more slowly than other jobs', so you will eventually free up the cluster.
>>>
>>> There isn't a way to 'freeze' or stop a job mid-state.
>>>
>>> Is the issue that the job occupies a large number of slots, or is it an
>>> issue of the individual tasks taking a long time to complete?
>>>
>>> If it's the latter, you will probably want to go to the capacity
>>> scheduler over the fair scheduler.
>>>
>>> HTH
>>>
>>> -Mike
>>>
>>> On May 11, 2012, at 6:08 AM, Harsh J wrote:
>>>
>>>> I do not know about the per-host slot control (that is most likely not
>>>> supported, or not yet anyway - and perhaps feels wrong to do), but the
>>>> rest of the needs are doable if you use schedulers and queues/pools.
>>>>
>>>> If you use FairScheduler (FS), ensure that this job always goes to a
>>>> special pool, and when you want to freeze the pool, simply set the
>>>> pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
>>>> tasks as you wish, to constrict instead of freeze.
>>>> When you make changes to the FairScheduler configs, you do not need
>>>> to restart the JT; simply wait a few seconds for FairScheduler to
>>>> refresh its own configs.
>>>>
>>>> More on FS at
>>>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html
>>>>
>>>> If you use CapacityScheduler (CS), then I believe you can do this by
>>>> again making sure the job goes to a specific queue, and when you need
>>>> to freeze it, simply set the queue's maximum-capacity to 0 (percent);
>>>> to constrict it instead, choose a lower, positive percentage value as
>>>> you need. You can also have CS pick up config changes by refreshing
>>>> queues via mradmin.
>>>>
>>>> More on CS at
>>>> http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
>>>>
>>>> Neither approach will freeze/constrict the job immediately, but
>>>> either should certainly prevent it from progressing. That is, tasks
>>>> already running when the scheduler config is changed will continue
>>>> to run to completion, but further task scheduling from those jobs
>>>> will see the effect of the changes.
>>>>
>>>> P.S. A better solution would be to make your job not take as many
>>>> days, somehow? :-)
>>>>
>>>> On Fri, May 11, 2012 at 4:13 PM, Rita <[email protected]> wrote:
>>>>>
>>>>> I have a rather large map-reduce job which takes a few days. I was
>>>>> wondering if it's possible for me to freeze the job or make the job
>>>>> less intensive. Is it possible to reduce the number of slots per
>>>>> host and then increase them overnight?
>>>>>
>>>>> tia
>>>>>
>>>>> --
>>>>> --- Get your facts first, then you can distort them as you please.--
>>>>
>>>> --
>>>> Harsh J

--
Harsh J
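Harsh's FairScheduler freeze can be sketched as an allocation-file entry. This is a minimal sketch assuming the 1.x fair scheduler's allocation-file format; the pool name `longjob` is a placeholder for whatever pool the job is submitted to:

```xml
<!-- Fair scheduler allocation file (e.g. fair-scheduler.xml).
     Setting both caps to 0 "freezes" the pool: no new tasks are
     scheduled from it, but tasks already running finish normally.
     The pool name "longjob" is hypothetical. -->
<allocations>
  <pool name="longjob">
    <maxMaps>0</maxMaps>
    <maxReduces>0</maxReduces>
  </pool>
</allocations>
```

As noted in the thread, the FairScheduler re-reads this file on its own after a short delay, so no JobTracker restart is needed; to resume the job, raise or remove the caps. On the CapacityScheduler route, the analogous knob is the queue's maximum-capacity setting in capacity-scheduler.xml, picked up by refreshing queues via mradmin.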
