The CapacityScheduler (hadoop-0.20.203 onwards) allows you to stop a queue and 
start it again.

That will give you the behavior you described.
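A rough sketch of what stopping a queue could look like in the capacity scheduler configuration; the queue name "etl" and the exact property key are assumptions here, so check the documentation for your exact release:

```xml
<!-- capacity scheduler configuration sketch; the queue name "etl" and the
     property name are placeholders, not verified against 0.20.203 -->
<property>
  <name>mapred.queue.etl.state</name>
  <value>stopped</value> <!-- set back to "running" to reopen the queue -->
</property>
```

Depending on the version, the change may be picked up via a queue refresh or may require a JobTracker restart.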

Arun

On Dec 12, 2011, at 5:50 AM, Dino Kečo wrote:

> Hi Hadoop users,
> 
> In my company we have been using Hadoop for two years, and we need to pause 
> and resume MapReduce jobs. I searched the Hadoop JIRA and found a couple of 
> tickets on this that are still unresolved, so we implemented our own 
> solution. I would like to share this approach with you and hear your 
> opinions on it.
> 
> We created a special pool in the fair scheduler called PAUSE (maxMapTasks 
> = 0, maxReduceTasks = 0). To pause a job, we move it into this pool and 
> kill all of its running tasks; to resume it, we move the job into some 
> other pool. While jobs are paused we can do maintenance on the whole 
> cluster except the JobTracker, as well as on some external services that 
> our jobs depend on. 
> 
> We know that records being processed by the killed tasks will be 
> reprocessed. In some cases we use the same HBase table as both input and 
> output, and we save the job id on each record. When a record is 
> re-processed, we check this job id and skip the record if it was already 
> processed by the same job. 
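The skip-on-same-job-id check can be sketched in plain Java. This is a minimal illustration of the logic, with a `Map` standing in for the HBase row; the column name `"job_id"` and the helper name are hypothetical, since the real schema is not shown in the mail:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the idempotency check: skip a record that was already stamped
// by the current job (i.e. processed before a pause and re-read on resume).
public class IdempotencyCheck {
    // Hypothetical helper; "job_id" is a stand-in for the real HBase column.
    static boolean shouldProcess(Map<String, String> record, String currentJobId) {
        String stamped = record.get("job_id");
        // Process only if the record was never stamped, or was stamped by a
        // different (earlier) job.
        return stamped == null || !stamped.equals(currentJobId);
    }

    public static void main(String[] args) {
        Map<String, String> record = new HashMap<>();
        String jobId = "job_201112120001_0042"; // example Hadoop job id format

        System.out.println(shouldProcess(record, jobId)); // true: never processed
        record.put("job_id", jobId);                      // stamp after processing
        System.out.println(shouldProcess(record, jobId)); // false: same job, skip
        System.out.println(shouldProcess(record, "job_201112120001_0043")); // true
    }
}
```

In the real mapper this check would run against the HBase `Result` for the input row, and the stamp would be written back alongside the job's output columns.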
> 
> Our custom fair scheduler implementation has this logic built in, and it 
> is deployed on our cluster. 
> 
> Please share your comments and concerns about this approach. 
> 
> Regards,
> dino
