SequenceFileInputFormat doesn't return whole records

Tim Fletcher Fri, 19 Aug 2011 05:32:14 -0700

Hi all,

I am having issues using SequenceFileInputFormat to retrieve whole records.


I have one job that writes to a SequenceFile:

SequenceFileOutputFormat.setOutputPath(job, new Path("out/data"));
SequenceFileOutputFormat.setOutputCompressionType(job,
    SequenceFile.CompressionType.NONE);

I then have a second job that is meant to read the file for processing:

SequenceFileInputFormat.addInputPath(job, new Path("out/data"));

However, the values that I get as the arguments to the map function of my job
only seem to contain parts of each record. I am sure that I am missing
something rather fundamental about how Hadoop splits inputs to the Mapper,
but I can't seem to find a way to stop the records being split.
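
For reference, here is a minimal sketch of how the two jobs could be set up in full. It is a guess at the missing setup, not the actual code: the class and method names (SequenceFileJobs, configureWriter, configureReader) are made up for illustration, and it assumes the new org.apache.hadoop.mapreduce API. The one point it is meant to show is that setOutputPath and addInputPath are static methods inherited from FileOutputFormat and FileInputFormat, so calling them through the SequenceFile classes only sets the path. The format itself has to be set separately, otherwise the job falls back to the default TextInputFormat, which hands the Mapper one line of bytes at a time.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical helper class; names are illustrative only.
public class SequenceFileJobs {

    // First job: writes its output as an uncompressed SequenceFile.
    public static Job configureWriter(Configuration conf) throws IOException {
        // new Job(conf, name) is the older constructor; later releases
        // offer Job.getInstance(conf, name) instead.
        Job job = new Job(conf, "write-sequence-file");

        // Select the output format explicitly. setOutputPath alone
        // does not do this.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path("out/data"));
        SequenceFileOutputFormat.setOutputCompressionType(job,
                SequenceFile.CompressionType.NONE);

        // Mapper, reducer and output key/value classes are omitted here
        // because the original post does not show them.
        return job;
    }

    // Second job: reads the SequenceFile written by the first job.
    public static Job configureReader(Configuration conf) throws IOException {
        Job job = new Job(conf, "read-sequence-file");

        // Select the input format explicitly. addInputPath alone
        // leaves the default TextInputFormat in place.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("out/data"));

        return job;
    }
}
```

With this setup the Mapper in the second job should be declared with the same key and value types that the first job wrote, since SequenceFileInputFormat passes those types through unchanged.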

Any help (or a pointer to a specific page in the docs) would be greatly
appreciated.

Regards,
Tim
