Hi Harsh,

My situation is that I need to kill the job when this threshold is reached. Say the threshold is 10, and 2 mappers combined reach this value: how should I achieve this?
With what you are saying, I think the job will fail only once a single mapper reaches that threshold.

Thanks,

On Tue, Nov 15, 2011 at 11:22 AM, Harsh J <[email protected]> wrote:
> Mapred,
>
> If you fail a task permanently upon encountering a bad situation, you
> basically end up failing the job as well, automatically. By controlling
> the number of retries (say, down to 1 or 2 from the default of 4 total
> attempts), you can also have it fail the job faster.
>
> Is killing the job immediately a necessity? Why?
>
> I s'pose you could call kill from within the mapper, but I've never seen
> that as necessary in any situation so far. What's wrong with letting the
> job auto-die as a result of a failing task?
>
> On 16-Nov-2011, at 12:38 AM, Mapred Learn wrote:
>
> Thanks David for a step-by-step response, but this makes the error
> threshold a per-mapper threshold. Is there a way to make it per-job, so
> that all mappers share this value and increment it as a shared counter?
>
> On Tue, Nov 15, 2011 at 8:12 AM, David Rosenstrauch <[email protected]> wrote:
>
>> On 11/14/2011 06:06 PM, Mapred Learn wrote:
>>
>>> Hi,
>>>
>>> I have a use case where I want to pass a threshold value to a
>>> map-reduce job. For e.g.: error records = 10.
>>>
>>> I want the map-reduce job to fail if the total count of error_records
>>> in the job, i.e. across all mappers, reaches this value.
>>>
>>> How can I implement this, considering that each mapper would be
>>> processing some part of the input data?
>>>
>>> Thanks,
>>> -JJ
>>
>> 1) Pass in the threshold value as a configuration value of the M/R job.
>> (i.e., job.getConfiguration().setInt("error_threshold", 10) )
>>
>> 2) Make your mappers implement the Configurable interface. This will
>> ensure that every mapper gets passed a copy of the config object.
>>
>> 3) When you implement the setConf() method in your mapper (which
>> Configurable will force you to do), retrieve the threshold value from
>> the config and save it in an instance variable in the mapper. (i.e.,
>> int errorThreshold = conf.getInt("error_threshold", 10) )
>>
>> 4) In the mapper, when an error record occurs, increment a counter and
>> then check if the counter value exceeds the threshold. If so, throw an
>> exception. (e.g., if (++numErrors >= errorThreshold) throw new
>> RuntimeException("Too many errors") )
>>
>> The exception will kill the mapper. Hadoop will attempt to re-run it,
>> but subsequent attempts will also fail for the same reason, and
>> eventually the entire job will fail.
>>
>> HTH,
>>
>> DR
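For reference, David's steps 1-4 might look roughly like the sketch below (using the org.apache.hadoop.mapreduce API; the class name, the isErrorRecord() helper, and the key/value types are all illustrative assumptions, not from the thread). Note the counter here is per-mapper, which is exactly the limitation being discussed:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper following David's steps; adapt types and parsing
// logic to your own job.
public class ErrorThresholdMapper
    extends Mapper<LongWritable, Text, Text, NullWritable>
    implements Configurable {

  private Configuration conf;
  private int errorThreshold;
  private int numErrors = 0;  // per-mapper count, NOT shared across mappers

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    // Step 3: read the value the driver set in step 1, e.g.
    // job.getConfiguration().setInt("error_threshold", 10);
    this.errorThreshold = conf.getInt("error_threshold", 10);
  }

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    if (isErrorRecord(value)) {
      // Step 4: fail this task once it alone sees too many errors.
      if (++numErrors >= errorThreshold) {
        throw new RuntimeException("Too many errors: " + numErrors);
      }
      return;
    }
    context.write(value, NullWritable.get());
  }

  // Placeholder validity check; replace with your real record parsing.
  private boolean isErrorRecord(Text value) {
    return value.toString().isEmpty();
  }
}
```

Lowering the retry count Harsh mentioned (so the job dies faster) is done in the driver, e.g. setting the max map attempts property (named mapred.map.max.attempts in older releases) to 1 or 2.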
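On the job-wide question: running mappers cannot read each other's counts, but the driver can see the aggregate job counters, so one pattern (a sketch, not anything from the thread; the counter group/name "Records"/"ERRORS", the poll interval, and the class name are assumptions) is to submit the job without blocking, poll the aggregate counter, and kill the whole job once the combined count crosses the threshold. Each mapper would increment the shared counter with context.getCounter("Records", "ERRORS").increment(1):

```java
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical driver-side watcher: enforces a job-wide error threshold
// by polling the job's aggregate counters and killing the job if exceeded.
public class ThresholdWatcher {

  public static void runWithThreshold(Job job, long threshold)
      throws Exception {
    job.submit();  // non-blocking, unlike waitForCompletion()
    while (!job.isComplete()) {
      Counter errors =
          job.getCounters().findCounter("Records", "ERRORS");
      if (errors != null && errors.getValue() >= threshold) {
        job.killJob();  // stops all tasks, failing the whole job at once
        break;
      }
      Thread.sleep(5000);  // poll interval; tune to taste
    }
  }
}
```

One caveat: counter values from in-flight tasks reach the framework via periodic task heartbeats, so the check lags slightly behind reality and the job may overshoot the threshold by a few records before it is killed.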
