Samir,

This should be possible. One way is:

Your custom RecordReader's initialization would need to check whether a file already exists before it tries to create one and, if it does, simply pass through with 0 records to map(…), thereby satisfying what you want to do. You may also want to remove the output directory existence check from your subclassed FileOutputFormat (override #checkOutputSpecs).

On 25-Nov-2011, at 5:24 AM, Samir Eljazovic wrote:

> Hi all,
> I was wondering if there is an off-the-shelf solution to re-use the output of
> a job that was killed when re-running the job?
>
> Here's my use-case: a job (with map phase only) is running and has 60% of its
> work completed before it gets killed. Output files from successfully
> completed tasks will be created in the specified output directory. The next time
> I re-run this job using the same input data, I would like to re-use those
> files to skip processing data which was already processed.
>
> Do you know if something similar exists and what would be the right way to do it?
>
> Thanks,
> Samir
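
The existence check suggested in the reply can be sketched in plain Java, without any Hadoop classes, roughly as follows. This is only an illustration of the decision logic, not a working RecordReader: the class ResumeCheck and its methods are hypothetical names, the part-m-NNNNN file naming is an assumption about how a map-only job names its output, and java.nio.file stands in for the Hadoop FileSystem calls a real job would use. It also assumes that a re-run over the same input assigns the same task index to the same split.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper: decides whether a map task can be skipped on a re-run.
final class ResumeCheck {

    // Assumed output naming for map-only tasks: part-m-00000, part-m-00001, ...
    static String partFileName(int taskIndex) {
        return String.format(Locale.ROOT, "part-m-%05d", taskIndex);
    }

    // True when an earlier run already left an output file for this task.
    static boolean alreadyProcessed(Path outputDir, int taskIndex) {
        return Files.isRegularFile(outputDir.resolve(partFileName(taskIndex)));
    }

    // Number of records the reader should hand to map(): zero if the task is skipped.
    static int recordsToEmit(Path outputDir, int taskIndex, int recordsInSplit) {
        return alreadyProcessed(outputDir, taskIndex) ? 0 : recordsInSplit;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("job-output");
        // Pretend task 0 completed before the job was killed.
        Files.createFile(out.resolve(partFileName(0)));

        System.out.println(recordsToEmit(out, 0, 100)); // prints 0
        System.out.println(recordsToEmit(out, 1, 100)); // prints 100
    }
}
```

In an actual job the same check would sit in the custom RecordReader's initialization, with the reader reporting no records when the file is found.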
