So the initial input stream is decompressed, each temporary file is
compressed and then decompressed, and the Avro output is recompressed and
then decompressed again at the reducers?
I'm counting 2 compressions and 2 decompressions at the mappers and 1
decompression at the reducers.
Am I getting this right?
- Tim.
________________________________________
From: Arun C Murthy [[email protected]]
Sent: Thursday, January 12, 2012 3:15 PM
To: [email protected]
Subject: Re: Can spill to disk be in compressed format to reduce I/O?
Temporary map-output files don't use the Avro format. They use a custom format,
which should be compressed if you set mapred.compress.map.output to true.
Arun
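
[Editor's sketch, not part of the original message.] A minimal illustration of
the setting described above, assuming the old org.apache.hadoop.mapred API and
the Hadoop 1.x property names. It needs the Hadoop jars on the classpath. The
class name and helper method are hypothetical, and DefaultCodec (zlib/Deflate)
is an assumed codec choice that the thread does not confirm. The thread also
reports that setting the flag alone "didn't seem to work", so treat this as a
starting point for checking the job configuration, not a confirmed fix.

```java
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.mapred.JobConf;

public class MapOutputCompressionSketch {

    // Hypothetical helper: turns on compression of intermediate map output
    // (the spill and merge segments), not of the job's final output.
    public static JobConf enableMapOutputCompression(JobConf conf) {
        // Property named in the thread (Hadoop 1.x naming).
        conf.setBoolean("mapred.compress.map.output", true);
        // Assumption: DefaultCodec is the zlib/Deflate codec shipped with Hadoop.
        conf.setClass("mapred.map.output.compression.codec",
                      DefaultCodec.class, CompressionCodec.class);
        return conf;
    }

    public static void main(String[] args) {
        JobConf conf = enableMapOutputCompression(new JobConf());
        // Read the values back to confirm what the job would actually see.
        System.out.println(conf.getCompressMapOutput());
        System.out.println(
            conf.getMapOutputCompressorClass(DefaultCodec.class).getName());
    }
}
```

Reading the values back from the same JobConf that is submitted with the job
is one way to check whether the setting was applied to that job or to a
different configuration object.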
On Jan 12, 2012, at 8:08 AM, Frank Grimes wrote:
> I tried conf.setBoolean("mapred.compress.map.output", true); but it didn't
> seem to work.
>
> Also, since I'm using the Avro mapred APIs, maybe there's something
> Avro-specific needed to get it enabled?
> Should I ask on the Avro mailing lists?
>
> Thanks,
>
> Frank Grimes
>
>
> On 2012-01-12, at 10:49 AM, [email protected] wrote:
>
>> Hi Frank
>> Is map output compression enabled?
>>
>> The config param would be something like
>> mapred.map.output.compress=true
>> (This is from memory; please cross-check.)
>>
>> ------Original Message------
>> From: Frank Grimes
>> To: [email protected]
>> ReplyTo: [email protected]
>> Subject: Can spill to disk be in compressed format to reduce I/O?
>> Sent: Jan 12, 2012 21:10
>>
>> Hi All,
>>
>> We're trying to speed up an M/R job which combines multiple .avro files.
>> We've noticed that when it spills to disk, it's in uncompressed format.
>> Is there a way to make it spill temporary segments as .avro with Deflate
>> compression?
>>
>> Thanks,
>>
>> Frank Grimes
>>
>> Regards
>> Bejoy K S
>