Samir,

This should be possible. One way is:

Your custom RecordReader's initialization would need to check whether the 
output file for its task already exists before it tries to create one. If the 
file exists, the reader should simply pass 0 records through to map(…), so the 
task finishes without reprocessing anything, which is what you want.

You may also want to remove the output directory existence check from your 
subclassed FileOutputFormat (override #checkOutputSpecs), since by default it 
fails the job when the output directory already exists.
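
To make the two steps concrete, here is a minimal plain-Java sketch of the 
logic. It deliberately uses java.nio instead of the Hadoop classes, so every 
name in it (ResumeSketch, SkippingReader, checkOutputDir, runTask) is 
hypothetical and not Hadoop API. It also assumes that an output file only 
exists if the task that wrote it completed successfully. In a real job the 
existence check would live in your RecordReader and the relaxed directory 
check in your checkOutputSpecs override.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Plain-Java model of the idea; none of these names are Hadoop API. */
class ResumeSketch {

    /** Stands in for a RecordReader: hands records to map() one at a time. */
    static final class SkippingReader {
        private final Iterator<String> records;

        /**
         * Models the reader's initialization: if the output file this task
         * would write already exists, expose zero records.
         */
        SkippingReader(List<String> splitRecords, Path expectedOutput) {
            this.records = Files.exists(expectedOutput)
                    ? Collections.<String>emptyIterator()
                    : splitRecords.iterator();
        }

        boolean hasNext() { return records.hasNext(); }

        String next() { return records.next(); }
    }

    /** Models a relaxed output check: an existing directory is accepted. */
    static void checkOutputDir(Path outputDir) throws IOException {
        if (outputDir == null) {
            throw new IOException("Output directory not set");
        }
        // Deliberately no "already exists" failure here.
        Files.createDirectories(outputDir);
    }

    /** Runs one map-only "task"; returns how many records reached map(). */
    static int runTask(List<String> splitRecords, Path outputDir, String partName) {
        try {
            checkOutputDir(outputDir);
            Path out = outputDir.resolve(partName);
            SkippingReader reader = new SkippingReader(splitRecords, out);

            int mapped = 0;
            StringBuilder result = new StringBuilder();
            while (reader.hasNext()) {
                // The "map" step of this sketch: upper-case each record.
                result.append(reader.next().toUpperCase()).append('\n');
                mapped++;
            }
            // Write output only if this task actually processed something.
            if (mapped > 0) {
                Files.write(out, result.toString().getBytes(StandardCharsets.UTF_8));
            }
            return mapped;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling runTask twice with the same output directory and part name processes 
the records the first time and passes 0 records to the map step the second 
time, while a part name with no existing file is processed normally.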

On 25-Nov-2011, at 5:24 AM, Samir Eljazovic wrote:

> Hi all,
> I was wondering if there is an off-the-shelf solution to re-use the output of 
> a job which was killed when re-running the job?
> 
> Here's my use-case: Job (with map phase only) is running and has 60% of its 
> work completed before it gets killed. Output files from successfully 
> completed tasks will be created in the specified output directory. The next 
> time I re-run this job using the same input data I would like to re-use those 
> files to skip processing data which was already processed.
> 
> Do you know if something similar exists and what would be the right way to do it?
> 
> Thanks,
> Samir
