jjenkins278 opened a new issue, #47591: URL: https://github.com/apache/arrow/issues/47591
### Describe the bug, including details regarding any error messages, version, and platform.

I was experimenting with the effect of gzip compression levels on parquet encoding (writing to /dev/null) and came across an odd bit of behavior - lower compression levels ended up being significantly slower than higher levels, whereas I would expect the opposite.

Browsing through the code I found that the compression level configuration is currently being applied to the `memLevel` argument of `deflateInit2`, not `level`, namely in the following two locations:

https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/compression_zlib.cc#L199
https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/compression_zlib.cc#L346

Per [zlib](https://www.zlib.net/manual.html) documentation:

> The memLevel parameter specifies how much memory should be allocated for the internal compression state. memLevel=1 uses minimum memory but is slow and reduces compression ratio; memLevel=9 uses maximum memory for optimal speed. The default value is 8.

I believe the intention behind `compression_level` is or should be to affect `level`, as it does in the other compressors. If so, can the callsites be updated accordingly?

### Component(s)

C++

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
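The difference between `level` and `memLevel` described in the issue can be sketched outside Arrow with Python's standard `zlib` module, whose `compressobj` takes the same parameters as `deflateInit2`. This is only an illustration, not Arrow's code: the helper names `gzip_compress` and `gzip_compress_swapped` are hypothetical, and the assumption that the reported call leaves `level` at zlib's default while the user setting lands in `memLevel` is inferred from the issue text rather than confirmed here.

```python
import zlib


def gzip_compress(data: bytes, level: int = 6, mem_level: int = 8) -> bytes:
    """Compress `data` as gzip, passing `level` and `memLevel` separately."""
    comp = zlib.compressobj(
        level=level,            # speed vs. ratio trade-off: 1 (fast) .. 9 (best)
        method=zlib.DEFLATED,
        wbits=31,               # 15-bit window plus 16 selects the gzip wrapper
        memLevel=mem_level,     # internal state memory: 1 (least) .. 9 (most)
        strategy=zlib.Z_DEFAULT_STRATEGY,
    )
    return comp.compress(data) + comp.flush()


def gzip_compress_swapped(data: bytes, compression_level: int) -> bytes:
    """Hypothetical reproduction of the reported behavior.

    Assumption: the user's setting is routed to memLevel while the actual
    compression level stays at zlib's default.
    """
    return gzip_compress(
        data,
        level=zlib.Z_DEFAULT_COMPRESSION,
        mem_level=compression_level,
    )


if __name__ == "__main__":
    payload = b"apache arrow parquet column chunk " * 4096
    for setting in (1, 5, 9):
        intended = gzip_compress(payload, level=setting)
        swapped = gzip_compress_swapped(payload, setting)
        print(setting, len(intended), len(swapped))
```

Timing the two helpers on realistic column data would be one way to check whether the swapped variant shows the "lower setting is slower" pattern from the report; the sketch itself only prints output sizes.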
