alamb commented on code in PR #21917:
URL: https://github.com/apache/datafusion/pull/21917#discussion_r3178154578


##########
datafusion/physical-plan/src/spill/mod.rs:
##########
@@ -290,6 +290,17 @@ struct IPCStreamWriter {
 
 impl IPCStreamWriter {
     /// Create new writer
+    ///
+    /// # Codec contract
+    ///
+    /// `arrow-ipc` must be compiled with the `lz4` and `zstd` features

Review Comment:
   I think you could probably add an explicit dependency here that would make 
the code clearer and avoid the cargo-machete shenanigans. I think this would be 
less confusing
   
   Something like
   ```diff
   
   diff --git a/datafusion/physical-plan/src/spill/mod.rs 
b/datafusion/physical-plan/src/spill/mod.rs
   index f5afce50d..8e9ca684c 100644
   --- a/datafusion/physical-plan/src/spill/mod.rs
   +++ b/datafusion/physical-plan/src/spill/mod.rs
   @@ -48,7 +48,7 @@ use arrow::ipc::{
    };
    use arrow::record_batch::RecordBatch;
    use arrow_data::ArrayDataBuilder;
   -
   +use arrow_ipc::CompressionType;
    use datafusion_common::config::SpillCompression;
    use datafusion_common::{DataFusionError, Result, exec_datafusion_err, 
exec_err};
    use datafusion_common_runtime::SpawnedTask;
   @@ -304,7 +304,7 @@ impl IPCStreamWriter {
        pub fn new(
            path: &Path,
            schema: &Schema,
   -        compression_type: SpillCompression,
   +        spill_compression: SpillCompression,
        ) -> Result<Self> {
            let file = File::create(path).map_err(|e| {
                exec_datafusion_err!("(Hint: you may increase the file 
descriptor limit with shell command 'ulimit -n 4096') Failed to create 
partition file at {path:?}: {e:?}")
   @@ -319,7 +319,8 @@ impl IPCStreamWriter {
            let alignment = get_max_alignment_for_schema(schema);
            let mut write_options =
                IpcWriteOptions::try_new(alignment, false, metadata_version)?;
   -        write_options = 
write_options.try_with_compression(compression_type.into())?;
   +        let compression_type = 
Option::<CompressionType>::from(spill_compression);
   +        write_options = 
write_options.try_with_compression(compression_type)?;
   
            let writer = StreamWriter::try_new_with_options(file, schema, 
write_options)?;
            Ok(Self {
   ```



##########
benchmarks/results/main/clickbench_1.json:
##########
@@ -0,0 +1,29 @@
+{

Review Comment:
   this change seems unrelated 🤔 



##########
datafusion/physical-plan/Cargo.toml:
##########
@@ -87,6 +95,11 @@ tokio = { workspace = true, features = [
     "parking_lot",
 ] }
 
+# `arrow-ipc` is used only through the `arrow::ipc` re-export, so cargo-machete
+# reports it as unused even though this crate relies on its codec features.
+[package.metadata.cargo-machete]

Review Comment:
   See above for a suggestion on how to simplify his



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to