I have written a MapReduce application that runs on 5 machines with Hadoop 1.0.2 installed. The app uses my custom FileInputFormat, RecordReader, TextOutputFormat, etc., with primary and secondary sort keys for a reduce-side join. It works well when I set the number of reduce tasks to 1. When I use more than one reduce task, my output files come out empty, even though the status output shows the maps produced output (Map output records=8268133). I did not use a Combiner. Does anyone know what I am doing wrong? I appreciate your help.
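For context, the secondary-sort pattern I followed looks roughly like the sketch below. This is simplified, self-contained placeholder code, not my actual classes (CompositeKey and the field names are illustrative): the idea is that the sort order uses both the primary and the secondary key, while the partition is computed from the primary (join) key only, so that with more than one reducer all records sharing a join key still meet in the same reduce call.

```java
// Simplified, self-contained sketch of the composite-key pattern for a
// reduce-side join with secondary sort (placeholder names, not real classes).
public class CompositeKeyDemo {

    // Composite key: sorted by (primary, secondary), partitioned by primary only.
    static class CompositeKey implements Comparable<CompositeKey> {
        final String primary;   // the join key
        final int secondary;    // tag/order within the join key (e.g. which side)

        CompositeKey(String primary, int secondary) {
            this.primary = primary;
            this.secondary = secondary;
        }

        @Override
        public int compareTo(CompositeKey other) {
            int cmp = primary.compareTo(other.primary);
            return cmp != 0 ? cmp : Integer.compare(secondary, other.secondary);
        }
    }

    // The partition function must ignore the secondary part; if it hashed the
    // whole composite key, records sharing a join key could land on different
    // reducers once numReduceTasks > 1, and the join would emit nothing.
    static int getPartition(CompositeKey key, int numReduceTasks) {
        return (key.primary.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        CompositeKey left  = new CompositeKey("user42", 0); // one side of the join
        CompositeKey right = new CompositeKey("user42", 1); // the other side
        // Both sides of the same join key must map to the same partition.
        System.out.println(getPartition(left, 2) == getPartition(right, 2));
        // And the secondary key must still order records within the group.
        System.out.println(left.compareTo(right) < 0);
    }
}
```

In a real job this partition logic would live in a custom Partitioner, with a matching grouping comparator that compares only the primary key.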
Here is the output status:

12/08/14 18:36:18 INFO mapred.JobClient:  map 100% reduce 100%
12/08/14 18:36:23 INFO mapred.JobClient: Job complete: job_201208081134_0075
12/08/14 18:36:23 INFO mapred.JobClient: Counters: 28
12/08/14 18:36:23 INFO mapred.JobClient:   Job Counters
12/08/14 18:36:23 INFO mapred.JobClient:     Launched reduce tasks=2
12/08/14 18:36:23 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=241075
12/08/14 18:36:23 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/14 18:36:23 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/08/14 18:36:23 INFO mapred.JobClient:     Launched map tasks=18
12/08/14 18:36:23 INFO mapred.JobClient:     Data-local map tasks=18
12/08/14 18:36:23 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=166309
12/08/14 18:36:23 INFO mapred.JobClient:   File Output Format Counters
12/08/14 18:36:23 INFO mapred.JobClient:     Bytes Written=0
12/08/14 18:36:23 INFO mapred.JobClient:   FileSystemCounters
12/08/14 18:36:23 INFO mapred.JobClient:     FILE_BYTES_READ=2244635327
12/08/14 18:36:23 INFO mapred.JobClient:     HDFS_BYTES_READ=996834876
12/08/14 18:36:23 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=3300936988
12/08/14 18:36:23 INFO mapred.JobClient:   File Input Format Counters
12/08/14 18:36:23 INFO mapred.JobClient:     Bytes Read=996832672
12/08/14 18:36:23 INFO mapred.JobClient:   Map-Reduce Framework
12/08/14 18:36:23 INFO mapred.JobClient:     Map output materialized bytes=1055918357
12/08/14 18:36:23 INFO mapred.JobClient:     Map input records=8268133
12/08/14 18:36:23 INFO mapred.JobClient:     Reduce shuffle bytes=1055918357
12/08/14 18:36:23 INFO mapred.JobClient:     Spilled Records=25895806
12/08/14 18:36:23 INFO mapred.JobClient:     Map output bytes=1039334953
12/08/14 18:36:23 INFO mapred.JobClient:     CPU time spent (ms)=357670
12/08/14 18:36:23 INFO mapred.JobClient:     Total committed heap usage (bytes)=3258384384
12/08/14 18:36:23 INFO mapred.JobClient:     Combine input records=0
12/08/14 18:36:23 INFO mapred.JobClient:     SPLIT_RAW_BYTES=2204
12/08/14 18:36:23 INFO mapred.JobClient:     Reduce input records=8268133
12/08/14 18:36:23 INFO mapred.JobClient:     Reduce input groups=1294180
12/08/14 18:36:23 INFO mapred.JobClient:     Combine output records=0
12/08/14 18:36:23 INFO mapred.JobClient:     Physical memory (bytes) snapshot=3936837632
12/08/14 18:36:23 INFO mapred.JobClient:     Reduce output records=0
12/08/14 18:36:23 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=10375229440
12/08/14 18:36:23 INFO mapred.JobClient:     Map output records=8268133

--
View this message in context: http://old.nabble.com/No-output-when-more-than-one-reduce-task-tp34299347p34299347.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
