jjenkins278 opened a new issue, #47591:
URL: https://github.com/apache/arrow/issues/47591

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I was experimenting with the effect of gzip compression levels on Parquet 
encoding (writing to /dev/null) and came across some odd behavior: lower 
compression levels were significantly slower than higher levels, whereas I 
would expect the opposite. Browsing through the code, I found that the 
compression level setting is currently passed as the `memLevel` argument of 
`deflateInit2` rather than `level`, in the following two locations:
   
   
https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/compression_zlib.cc#L199
   
https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/compression_zlib.cc#L346
   
   Per the [zlib](https://www.zlib.net/manual.html) documentation:
   
   > The memLevel parameter specifies how much memory should be allocated for 
the internal compression state. memLevel=1 uses minimum memory but is slow and 
reduces compression ratio; memLevel=9 uses maximum memory for optimal speed. 
The default value is 8.
   
   I believe `compression_level` is intended to set `level`, as it does in the 
other compressors. If so, can the two call sites be updated accordingly?
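
   To illustrate the difference between the two parameters, here is a minimal sketch using Python's standard `zlib` module, whose `compressobj` takes the same `level` and `memLevel` arguments as `deflateInit2`. This is not Arrow's code; the helper names `gzip_compress` and `gzip_decompress` and the sample data are assumptions made up for the example.

```python
import zlib

# 15 = maximum window size (2**15 bytes); adding 16 selects the gzip wrapper.
GZIP_WBITS = 15 + 16


def gzip_compress(data: bytes, level: int = 6, mem_level: int = 8) -> bytes:
    """Compress `data`, passing `level` and `memLevel` as separate arguments.

    `level` (0-9) trades speed for compression ratio.
    `mem_level` (1-9) only controls how much memory the internal state uses.
    """
    compressor = zlib.compressobj(
        level,                    # compression level: the user-facing knob
        zlib.DEFLATED,            # method
        GZIP_WBITS,               # windowBits
        mem_level,                # memLevel: zlib's documented default is 8
        zlib.Z_DEFAULT_STRATEGY,  # strategy
    )
    return compressor.compress(data) + compressor.flush()


def gzip_decompress(blob: bytes) -> bytes:
    """Inverse of gzip_compress (hypothetical helper for the round trip)."""
    return zlib.decompress(blob, GZIP_WBITS)


# Hypothetical sample payload, only here to exercise the helpers.
sample = b"parquet column chunk " * 50_000

fast = gzip_compress(sample, level=1)                  # user asks for speed
small = gzip_compress(sample, level=9)                 # user asks for ratio
low_mem = gzip_compress(sample, level=6, mem_level=1)  # what a swapped argument looks like
```

   In the last call, a requested level of 1 ends up in the `memLevel` slot, which per the zlib documentation quoted above means minimum memory, slower compression, and a worse ratio. That would be consistent with the slowdown I observed at low levels.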
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
