That works perfectly well, thank you Robert!

On Thu, Aug 2, 2012 at 1:09 PM, Robert Evans <[email protected]> wrote:
> The default text input format has a key of a LongWritable that is the
> offset into the file. The value is the full line.
>
> On 8/2/12 2:59 PM, "Harit Himanshu" <[email protected]> wrote:
>
> >StackOverflow link -
> >http://stackoverflow.com/questions/11784729/hadoop-java-lang-classcastexception-org-apache-hadoop-io-longwritable-cannot
> >
> >----------
> >
> >My program looks like
> >
> >public class TopKRecord extends Configured implements Tool {
> >
> >    public static class MapClass extends Mapper<Text, Text, Text, Text> {
> >
> >        public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
> >            // your map code goes here
> >            String[] fields = value.toString().split(",");
> >            String year = fields[1];
> >            String claims = fields[8];
> >
> >            if (claims.length() > 0 && (!claims.startsWith("\""))) {
> >                context.write(new Text(year.toString()), new Text(claims.toString()));
> >            }
> >        }
> >    }
> >
> >    public int run(String args[]) throws Exception {
> >        Job job = new Job();
> >        job.setJarByClass(TopKRecord.class);
> >
> >        job.setMapperClass(MapClass.class);
> >
> >        FileInputFormat.setInputPaths(job, new Path(args[0]));
> >        FileOutputFormat.setOutputPath(job, new Path(args[1]));
> >
> >        job.setJobName("TopKRecord");
> >        job.setMapOutputValueClass(Text.class);
> >        job.setNumReduceTasks(0);
> >        boolean success = job.waitForCompletion(true);
> >        return success ? 0 : 1;
> >    }
> >
> >    public static void main(String args[]) throws Exception {
> >        int ret = ToolRunner.run(new TopKRecord(), args);
> >        System.exit(ret);
> >    }
> >}
> >
> >The data looks like
> >
> >"PATENT","GYEAR","GDATE","APPYEAR","COUNTRY","POSTATE","ASSIGNEE","ASSCODE","CLAIMS","NCLASS","CAT","SUBCAT","CMADE","CRECEIVE","RATIOCIT","GENERAL","ORIGINAL","FWDAPLAG","BCKGTLAG","SELFCTUB","SELFCTLB","SECDUPBD","SECDLWBD"
> >3070801,1963,1096,,"BE","",,1,,269,6,69,,1,,0,,,,,,,
> >3070802,1963,1096,,"US","TX",,1,,2,6,63,,0,,,,,,,,,
> >3070803,1963,1096,,"US","IL",,1,,2,6,63,,9,,0.3704,,,,,,,
> >3070804,1963,1096,,"US","OH",,1,,2,6,63,,3,,0.6667,,,,,,,
> >
> >On running this program I see the following on console
> >
> >12/08/02 12:43:34 INFO mapred.JobClient: Task Id : attempt_201208021025_0007_m_000000_0, Status : FAILED
> >java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
> >    at com.hadoop.programs.TopKRecord$MapClass.map(TopKRecord.java:26)
> >    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >    at java.security.AccessController.doPrivileged(Native Method)
> >    at javax.security.auth.Subject.doAs(Subject.java:396)
> >    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> >    at org.apache.hadoop.mapred.Child.main(Child.java:249)
> >
> >I believe that the Class Types are mapped correctly, Class Mapper
> ><http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html>,
> >
> >Please let me know what is that I am doing wrong here?
> >
> >
> >Thank you
> >
> >+ Harit Himanshu
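
A note on Robert's answer, offered as a sketch and not as the code actually used in this thread: with the default text input format, the mapper's input key is the line's byte offset (a LongWritable) and the value is the line itself (a Text), so the mapper in the question would be declared as Mapper<LongWritable, Text, Text, Text> with map(LongWritable key, Text value, Context context). The plain-Java example below does not use Hadoop at all; it only imitates the (offset, line) pairs and reuses the field extraction from the question. The class name OffsetLineDemo, the helpers readRecords and extractYearAndClaims, the assumption of '\n'-only line endings, the added length guard, and the sample row with a non-empty CLAIMS value are all hypothetical and chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it imitates the input records in plain Java
// and is not part of Hadoop or of the program in the question.
final class OffsetLineDemo {

    /** One (key, value) pair in the shape the default text input format produces. */
    static final class Record {
        final long offset;  // byte offset where the line starts (the LongWritable key)
        final String line;  // the full line without its terminator (the Text value)

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Pairs each line with its starting byte offset. Assumes '\n' line endings. */
    static List<Record> readRecords(String content) {
        List<Record> records = new ArrayList<>();
        long offset = 0;
        for (String line : content.split("\n")) {
            records.add(new Record(offset, line));
            // Advance past the line's bytes plus one byte for the '\n' terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    /**
     * Same field logic as the map() body in the question: returns {year, claims},
     * or null when the row is skipped. The length check is an added safeguard
     * (an assumption), since String.split drops trailing empty fields.
     */
    static String[] extractYearAndClaims(String line) {
        String[] fields = line.split(",");
        if (fields.length <= 8) {
            return null;
        }
        String year = fields[1];
        String claims = fields[8];
        if (claims.length() > 0 && !claims.startsWith("\"")) {
            return new String[] { year, claims };
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical input: a shortened header plus one row whose CLAIMS
        // column (index 8) is filled in. Not taken from the thread's data.
        String content =
            "\"PATENT\",\"GYEAR\",\"GDATE\",\"APPYEAR\",\"COUNTRY\",\"POSTATE\",\"ASSIGNEE\",\"ASSCODE\",\"CLAIMS\"\n"
            + "3070801,1963,1096,,\"BE\",\"\",,1,12\n";
        for (Record r : readRecords(content)) {
            String[] out = extractYearAndClaims(r.line);
            System.out.println("key=" + r.offset + " emitted="
                + (out == null ? "nothing" : out[0] + "\t" + out[1]));
        }
    }
}
```

The key is never used by the extraction step, which is why changing only the declared key type from Text to LongWritable is enough to make the original mapper's logic run unchanged.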
