Re: OOM Error Map output copy.

Koji Noguchi Sat, 10 Dec 2011 00:13:40 -0800

Hi Niranjan, 

If you're using jvm version of 1.6.018 or newer, then setting


-XX:NewRatio=8

might help.

Some background. 
http://www.mail-archive.com/[email protected]/msg11286.html

>> 4. mapred.job.shuffle.input.buffer.percent    --> 0.70
>> 5. mapred.job.shuffle.merge.percent    --> 0.66
>>
These conf values may be too high after jvm's default NewRatio changed to 2.

Koji


On 12/9/11 10:29 AM, "Arun C Murthy" <[email protected]> wrote:

> Moving to mapreduce-user@, bcc common-user@. Please use project specific
> lists.
> 
> Niranjan,
> 
> If you average as 0.5G output per-map, it's 5000 maps *0.5G -> 2.5TB over 12
> reduces i.e. nearly 250G per reduce - compressed!
> 
> If you think you have 4:1 compression you are doing nearly a Terabyte per
> reducer... which is way too high!
> 
> I'd recommend you bump to somewhere along 1000 reduces to get to 2.5G
> (compressed) per reducer for your job. If your compression ratio is 2:1, try
> 500 reduces and so on.
> 
> If you are worried about other users, use the CapacityScheduler and submit
> your job to a queue with a small capacity and max-capacity to restrict your
> job to 10 or 20 concurrent reduces at a given point.
> 
> Arun
> 
> On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:
> 
>> All 
>> 
>> I am encountering the following out-of-memory error during the reduce phase
>> of a large job.
>> 
>> Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>> at 
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMem
>> ory(ReduceTask.java:1669)
>> at 
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput
>> (ReduceTask.java:1529)
>> at 
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(R
>> educeTask.java:1378)
>> at 
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTa
>> sk.java:1310)
>> I tried increasing the memory available using mapped.child.java.opts but that
>> only helps a little. The reduce task eventually fails again. Here are some
>> relevant job configuration details:
>> 
>> 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers
>> filter out a small percentage of the input ( less than 1%).
>> 
>> 2. I am currently using 12 reducers and I can't increase this count by much
>> to ensure availability of reduce slots for other users.
>> 
>> 3. mapred.child.java.opts --> -Xms512M -Xmx1536M -XX:+UseSerialGC
>> 
>> 4. mapred.job.shuffle.input.buffer.percent --> 0.70
>> 
>> 5. mapred.job.shuffle.merge.percent --> 0.66
>> 
>> 6. mapred.inmem.merge.threshold --> 1000
>> 
>> 7. I have nearly 5000 mappers which are supposed to produce LZO compressed
>> outputs. The logs seem to indicate that the map outputs range between 0.3G to
>> 0.8GB. 
>> 
>> Does anything here seem amiss? I'd appreciate any input of what settings to
>> try. I can try different reduced values for the input buffer percent and the
>> merge percent.  Given that the job runs for about 7-8 hours before crashing,
>> I would like to make some informed choices if possible.
>> 
>> Thanks. 
>> ~ Niranjan.
>> 
>> 
>> 
>

Re: OOM Error Map output copy.

Reply via email to