Hi all,

I'm quite new to Hadoop and have only worked with a single-node setup so far.
I wrote a local driver that submits jobs to my cluster. I instantiate a single
Configuration instance right at the start of my process and pass it around
like this:

public static void main(String[] args) {
    int exitCode = ERR_INVALID_ARGS.get();
    Configuration conf = new Configuration(true);

    try {
        if ('-' == args[0].charAt(0)) {
            exitCode = runSelectedTool(conf, args);
        } else if (args.length >= 2 && args.length <= 3) {
            exitCode = crunchFullDataset(conf, args);
        }
    } catch (Exception e) {
        e.printStackTrace();
        exitCode = ERR_FATAL.get();
    }

    System.exit(exitCode);
}

private static int runSelectedTool(Configuration conf, String[] args)
        throws Exception {
    // Default to the invalid-arguments code if the switch is not recognized.
    int exitCode = ERR_INVALID_ARGS.get();
    String toolSwitch = args[0];
    // Drop the switch so the tool only sees its own arguments.
    args = Arrays.copyOfRange(args, 1, args.length);

    if (SWITCH_FORMATTER_COUNTER.equals(toolSwitch)) {
        exitCode = ToolRunner.run(conf, new FormatterCounter(), args);
    } else if (SWITCH_CANDIDATES_FILTER.equals(toolSwitch)) {
        exitCode = ToolRunner.run(conf, new CandidatesFilter(), args);
    }

    return exitCode;
}

Prior to this, I was instantiating a new conf object each time I called
ToolRunner.run(), but now I use conf.set() and conf.get() to pass values
between jobs. Is this a bad idea (and if so, why), or is this the right way to
proceed?
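
To make the difference between the two approaches concrete, here is a minimal
sketch. It is not my real code: java.util.Properties stands in for Hadoop's
Configuration so that it runs without Hadoop on the classpath, and the key name
and the two "job" methods are made up for illustration.

```java
import java.util.Properties;

class SharedConfSketch {
    // Hypothetical key name, used only for this illustration.
    static final String KEY_STAGE1_OUTPUT = "myapp.stage1.output";

    // Stand-in for the first tool: records where it wrote its output.
    static void firstJob(Properties conf) {
        conf.setProperty(KEY_STAGE1_OUTPUT, "/tmp/stage1");
    }

    // Stand-in for the second tool: reads what the first one recorded.
    static String secondJob(Properties conf) {
        return conf.getProperty(KEY_STAGE1_OUTPUT);
    }

    // Current approach: one instance shared by every run in the process.
    static String runWithSharedConf() {
        Properties conf = new Properties();
        firstJob(conf);
        return secondJob(conf);
    }

    // Previous approach: a new instance for each run, so nothing carries over.
    static String runWithFreshConfs() {
        firstJob(new Properties());
        return secondJob(new Properties());
    }

    public static void main(String[] args) {
        System.out.println("shared: " + runWithSharedConf());
        System.out.println("fresh:  " + runWithFreshConfs());
    }
}
```

With the shared instance the second job sees the value set by the first; with
fresh instances it gets null. That visibility is what I rely on now, and it is
also what makes me wonder about side effects between runs.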

Many thanks,
Pierre
