ccciudatu opened a new issue, #43469: URL: https://github.com/apache/arrow/issues/43469
### Describe the enhancement requested

Application code is currently required to choose upfront between handling compressed vs. uncompressed data by specifying one of the two (mutually exclusive) `CompressionCodec.Factory` implementations: `NoCompressionCodec.Factory` and `CommonsCompressionCodecFactory`.

While this is totally acceptable (or even required) for the write path (e.g. `ArrowWriter`), it makes it really tedious to support compression on the read path, as it's not reasonable to choose between handling _uncompressed-data-only_ and _compressed-data-only_ when writing (e.g.) a client app for Arrow Flight.

As already reported in https://github.com/apache/arrow/issues/41457, the Java FlightClient currently fails with the following error when trying to decode a compressed stream:

```
java.lang.IllegalArgumentException: Please add arrow-compression module to use CommonsCompressionCodecFactory for LZ4_FRAME
	at org.apache.arrow.vector.compression.NoCompressionCodec$Factory.createCodec(NoCompressionCodec.java:63)
	at org.apache.arrow.vector.compression.CompressionCodec$Factory$1.createCodec(CompressionCodec.java:91)
	at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:79)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:275)
```

The `FlightStream` class does not explicitly pass a compression codec factory when creating a `VectorLoader`, which then uses the default `NoCompressionCodec.Factory`.

Changing the default to `CommonsCompressionCodecFactory` is not an option because:

1. `CommonsCompressionCodecFactory` does not support uncompressed data
2. `arrow-compression` is not a dependency for `arrow-vector`

Instead of challenging these two design decisions, the proposed solution (upcoming PR) is to make the default `CompressionCodec.Factory` use a `ServiceLoader` to gather all the available implementations and combine them to support as many `CodecType`s as possible, falling back to the `NO_COMPRESSION` codec type (i.e. the same default as today).

The `arrow-compression` module would then act as a service provider, so that whenever it's present in the module- (or class-) path, it will transparently fill in the gaps of the default factory.

As a side note, this is in fact the literal meaning of the above error message (_"Please add arrow-compression module to use CommonsCompressionCodecFactory"_), so we can assume this was the original intention.

### Component(s)

FlightRPC, Java
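
To make the `ServiceLoader` idea more concrete, here is a minimal, self-contained sketch of how such a composite default factory might behave. It is not the actual PR and does not use the real Arrow classes: `CodecType`, `Codec`, `CodecFactory`, `NoCompressionFactory`, `CompositeFactory` and `defaultFactory()` are all simplified, hypothetical stand-ins, and the "throw `IllegalArgumentException` for unsupported types" contract is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class CompositeCodecFactorySketch {

    // Hypothetical stand-ins for Arrow's codec type, codec and codec factory.
    public enum CodecType { NO_COMPRESSION, LZ4_FRAME, ZSTD }

    public interface Codec {
        CodecType type();
    }

    public interface CodecFactory {
        // Assumed contract: throws IllegalArgumentException for unsupported types.
        Codec createCodec(CodecType type);
    }

    /** Fallback that only handles uncompressed data, like today's default. */
    public static final class NoCompressionFactory implements CodecFactory {
        @Override
        public Codec createCodec(CodecType type) {
            if (type == CodecType.NO_COMPRESSION) {
                return () -> CodecType.NO_COMPRESSION;
            }
            throw new IllegalArgumentException("No codec provider available for " + type);
        }
    }

    /** Asks each discovered provider in turn, then the no-compression fallback. */
    public static final class CompositeFactory implements CodecFactory {
        private final List<CodecFactory> delegates = new ArrayList<>();

        public CompositeFactory(Iterable<CodecFactory> providers) {
            for (CodecFactory provider : providers) {
                delegates.add(provider);
            }
            // The fallback always goes last, so providers fill the gaps first.
            delegates.add(new NoCompressionFactory());
        }

        @Override
        public Codec createCodec(CodecType type) {
            for (CodecFactory delegate : delegates) {
                try {
                    return delegate.createCodec(type);
                } catch (IllegalArgumentException unsupported) {
                    // This delegate cannot handle the type; try the next one.
                }
            }
            throw new IllegalArgumentException("No codec provider available for " + type);
        }
    }

    /** Builds the default factory from whatever providers are on the class path. */
    public static CodecFactory defaultFactory() {
        return new CompositeFactory(ServiceLoader.load(CodecFactory.class));
    }

    public static void main(String[] args) {
        CodecFactory factory = defaultFactory();
        System.out.println(factory.createCodec(CodecType.NO_COMPRESSION).type());
        try {
            factory.createCodec(CodecType.LZ4_FRAME);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch a provider module would register its factory the usual `ServiceLoader` way, either through a `META-INF/services/` file named after the factory interface or through a `provides ... with ...` clause in `module-info.java`. With no provider present, only `NO_COMPRESSION` resolves, which matches the current default behaviour described above.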