Hi,
I am using Cloudera's Hadoop distribution (Hadoop 0.20.2-cdh3u3) and am trying to use MultipleInputs with a separate mapper class per input path, in the following manner:
public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(IntegrateExisting.class);
    conf.setJobName("IntegrateExisting");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);

    Path existingKeysInputPath = new Path(args[0]);
    Path newKeysInputPath = new Path(args[1]);
    Path outputPath = new Path(args[2]);

    MultipleInputs.addInputPath(conf, existingKeysInputPath,
        TextInputFormat.class, MapExisting.class);
    MultipleInputs.addInputPath(conf, newKeysInputPath,
        TextInputFormat.class, MapNew.class);

    conf.setCombinerClass(ReduceAndFilterOut.class);
    conf.setReducerClass(ReduceAndFilterOut.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileOutputFormat.setOutputPath(conf, outputPath);
    //FileInputFormat.addInputPath(conf, existingKeysInputPath);
    //FileInputFormat.addInputPath(conf, newKeysInputPath);

    JobClient.runJob(conf);
}
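One thing I wonder about (this is only a guess on my part, not something I have confirmed in the Hadoop sources): JobConf is an ordinary key/value configuration, so the last setter of a property wins, and MultipleInputs.addInputPath installs DelegatingInputFormat as the job's input format. If that is right, my later conf.setInputFormat(TextInputFormat.class) call would silently replace it. A minimal sketch of that last-write-wins behavior, using a plain HashMap as a hypothetical stand-in for JobConf (the property name is illustrative only):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobConf: properties are plain key/value pairs,
// so whichever call sets the input-format property last wins.
public class InputFormatOverrideSketch {
    // Illustrative property name, not necessarily the real JobConf key.
    static final String KEY = "mapred.input.format.class";

    // Mimics MultipleInputs.addInputPath(...) followed by a later
    // conf.setInputFormat(TextInputFormat.class) call.
    static String resultingInputFormat() {
        Map<String, String> conf = new HashMap<>();
        // MultipleInputs.addInputPath(...) installs DelegatingInputFormat.
        conf.put(KEY, "org.apache.hadoop.mapred.lib.DelegatingInputFormat");
        // A later setInputFormat call overwrites that entry.
        conf.put(KEY, "org.apache.hadoop.mapred.TextInputFormat");
        return conf.get(KEY);
    }

    public static void main(String[] args) {
        System.out.println("job will use: " + resultingInputFormat());
    }
}
```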
With the commented-out FileInputFormat.addInputPath lines left out, as shown above, the MR job fails with the following error:
12/07/05 16:59:25 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: No input paths specified in job
Exception in thread "main" java.io.IOException: No input paths specified in job
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:153)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:205)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:971)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:963)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
    at org.myorg.IntegrateExisting.main(IntegrateExisting.java:122)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Uncommenting those two lines instead leads to the following error in the mappers:
java.lang.ClassCastException: org.apache.hadoop.mapred.FileSplit cannot be cast to org.apache.hadoop.mapred.lib.TaggedInputSplit
    at org.apache.hadoop.mapred.lib.DelegatingMapper.map(DelegatingMapper.java:48)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
I see that the fix for MAPREDUCE-1178, which discusses the second error, is included in the CDH3 version. Is there any code missing from the snippet above?
Thanks for the help.
Regards,
Sanchita