Hi
I think the attachment will not go through to [email protected].
OK, please have a look below.
MAP
------------------------
package test;
import java.io.IOException;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
public class myMapper extends MapReduceBase implements
Mapper<LongWritable,Text,FloatWritable,Text> {
public void map(LongWritable offset, Text
val,OutputCollector<FloatWritable,Text> output, Reporter reporter) throws
IOException {
output.collect(new FloatWritable(1), val);
}
}
REDUCER
------------------------------
Prepare the reducer to do exactly what you need.
JOB
------------------------
package test;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
/**
 * Driver for a map-only job that reads plain text and writes
 * (FloatWritable, Text) pairs to a SequenceFile.
 */
public class TestDemo extends Configured implements Tool {

    /**
     * Entry point; delegates to ToolRunner so Hadoop generic options
     * (-D, -fs, -jt, ...) are handled before {@link #run} is invoked.
     */
    public static void main(String args[]) throws Exception {
        int res = ToolRunner.run(new Configuration(), new TestDemo(), args);
        System.exit(res);
    }

    /**
     * Configures and submits the job.
     *
     * @param args command-line arguments, possibly including generic options;
     *             after parsing, exactly two must remain: inputDir outputDir
     * @return 0 on success, -1 on a usage error
     */
    @Override
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(TestDemo.class);
        // Strip Hadoop generic options; what remains are our own arguments.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        // BUG FIX: the original computed otherArgs but then read raw args for the
        // paths, so a generic option could be mistaken for a path; it also never
        // checked the argument count (ArrayIndexOutOfBoundsException on misuse).
        if (otherArgs.length != 2) {
            System.err.println("Usage: TestDemo <inputDir> <outputDir>");
            ToolRunner.printGenericCommandUsage(System.err);
            return -1;
        }
        conf.setJobName("TestCustomInputOutput");
        conf.setMapperClass(myMapper.class);
        conf.setMapOutputKeyClass(FloatWritable.class);
        conf.setMapOutputValueClass(Text.class);
        // Map-only job: with zero reducers the map output is written directly by
        // the output format, so the map output key/value classes must match the
        // job's output key/value classes (this is what avoids the
        // "wrong key class" error from SequenceFile$Writer.append).
        conf.setNumReduceTasks(0);
        conf.setOutputKeyClass(FloatWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);
        TextInputFormat.addInputPath(conf, new Path(otherArgs[0]));
        SequenceFileOutputFormat.setOutputPath(conf, new Path(otherArgs[1]));
        JobClient.runJob(conf);
        return 0;
    }
}
On Wed, May 30, 2012 at 6:57 PM, samir das mohapatra <
[email protected]> wrote:
> PFA.
>
>
> On Wed, May 30, 2012 at 2:45 AM, Mark question <[email protected]>wrote:
>
>> Hi Samir, can you email me your main class.. or if you can check mine, it
>> is as follows:
>>
>> public class SortByNorm1 extends Configured implements Tool {
>>
>> @Override public int run(String[] args) throws Exception {
>>
>> if (args.length != 2) {
>> System.err.printf("Usage:bin/hadoop jar norm1.jar <inputDir>
>> <outputDir>\n");
>> ToolRunner.printGenericCommandUsage(System.err);
>> return -1;
>> }
>> JobConf conf = new JobConf(new Configuration(),SortByNorm1.class);
>> conf.setJobName("SortDocByNorm1");
>> conf.setMapperClass(Norm1Mapper.class);
>> conf.setMapOutputKeyClass(FloatWritable.class);
>> conf.setMapOutputValueClass(Text.class);
>> conf.setNumReduceTasks(0);
>> conf.setReducerClass(Norm1Reducer.class);
>> conf.setOutputKeyClass(FloatWritable.class);
>> conf.setOutputValueClass(Text.class);
>>
>> conf.setInputFormat(TextInputFormat.class);
>> conf.setOutputFormat(SequenceFileOutputFormat.class);
>>
>> TextInputFormat.addInputPath(conf, new Path(args[0]));
>> SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1]));
>> JobClient.runJob(conf);
>> return 0;
>> }
>> public static void main(String[] args) throws Exception {
>> int exitCode = ToolRunner.run(new SortByNorm1(), args);
>> System.exit(exitCode);
>> }
>>
>>
>> On Tue, May 29, 2012 at 1:55 PM, samir das mohapatra <
>> [email protected]> wrote:
>>
>> > Hi Mark
>> > See the out put for that same Application .
>> > I am not getting any error.
>> >
>> >
>> > On Wed, May 30, 2012 at 1:27 AM, Mark question <[email protected]
>> >wrote:
>> >
>> >> Hi guys, this is a very simple program, trying to use TextInputFormat
>> and
>> >> SequenceFileoutputFormat. Should be easy but I get the same error.
>> >>
>> >> Here is my configurations:
>> >>
>> >> conf.setMapperClass(myMapper.class);
>> >> conf.setMapOutputKeyClass(FloatWritable.class);
>> >> conf.setMapOutputValueClass(Text.class);
>> >> conf.setNumReduceTasks(0);
>> >> conf.setOutputKeyClass(FloatWritable.class);
>> >> conf.setOutputValueClass(Text.class);
>> >>
>> >> conf.setInputFormat(TextInputFormat.class);
>> >> conf.setOutputFormat(SequenceFileOutputFormat.class);
>> >>
>> >> TextInputFormat.addInputPath(conf, new Path(args[0]));
>> >> SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1]));
>> >>
>> >>
>> >> myMapper class is:
>> >>
>> >> public class myMapper extends MapReduceBase implements
>> >> Mapper<LongWritable,Text,FloatWritable,Text> {
>> >>
>> >> public void map(LongWritable offset, Text
>> >> val,OutputCollector<FloatWritable,Text> output, Reporter reporter)
>> >> throws IOException {
>> >> output.collect(new FloatWritable(1), val);
>> >> }
>> >> }
>> >>
>> >> But I get the following error:
>> >>
>> >> 12/05/29 12:54:31 INFO mapreduce.Job: Task Id :
>> >> attempt_201205260045_0032_m_000000_0, Status : FAILED
>> >> java.io.IOException: wrong key class:
>> org.apache.hadoop.io.LongWritable is
>> >> not class org.apache.hadoop.io.FloatWritable
>> >> at
>> >> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:998)
>> >> at
>> >>
>> >>
>> org.apache.hadoop.mapred.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:75)
>> >> at
>> >>
>> >>
>> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.collect(MapTask.java:705)
>> >> at
>> >>
>> >>
>> org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:508)
>> >> at
>> >>
>> >>
>> filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:59)
>> >> at
>> >>
>> >>
>> filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:1)
>> >> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>> >> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397)
>> >> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>> >> at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>> >> at java.security.AccessController.doPrivileged(Native Method)
>> >> at javax.security.auth.Subject.doAs(Subject.java:396)
>> >> at org.apache.hadoop.security.Use
>> >>
>> >> Where is the writing of LongWritable coming from ??
>> >>
>> >> Thank you,
>> >> Mark
>> >>
>> >
>> >
>>
>
>