ametel01 opened a new pull request, #23172:
URL: https://github.com/apache/datafusion/pull/23172

## Which issue does this PR close?

- Closes #23171.

## Rationale for this change

Imported Substrait physical plans currently turn `ReadRel.LocalFiles` entries into a local filesystem-backed Parquet `DataSourceExec` without requiring the embedding host to approve the referenced local paths.

In hosts that accept physical Substrait plans from lower-trust callers, that can let serialized plan input select process-local Parquet files outside intended dataset roots. This change makes local file access during physical plan import explicit instead of ambient.

## What changes are included in this PR?

- Adds `PhysicalPlanConsumerOptions` for Substrait physical plan import.
- Keeps `from_substrait_rel` as a default-deny wrapper for local file imports.
- Adds `from_substrait_rel_with_options` so callers can opt in with allowed local file roots.
- Canonicalizes imported local file paths and configured roots before comparing them.
- Rejects local file globs and folders in physical plan import rather than accepting them without a policy.
- Updates physical roundtrip tests to pass explicit allowed roots.
- Adds regression coverage for missing allowed roots and paths outside the allowed root.

## Are these changes tested?

Yes.

- `cargo fmt --all --check`
- `cargo test -p datafusion-substrait`
- `cargo clippy -p datafusion-substrait --all-targets --all-features -- -D warnings`
- `cargo clippy --all-targets --all-features -- -D warnings`

## Are there any user-facing changes?

Yes. Substrait physical plan consumers that import `ReadRel.LocalFiles` now need to call `from_substrait_rel_with_options` and explicitly configure allowed local file roots. The existing `from_substrait_rel` API no longer imports local files by default.

This is an intentional security hardening change. It may require the `api change` label because it changes behavior for the existing public API and adds a new opt-in API.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
For additional commands, e-mail: [email protected]
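
For readers unfamiliar with the pattern, the "canonicalize, then compare against allowed roots" check described in the PR body can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the code in the PR: the function name `resolve_allowed_file` and its signature are hypothetical, and the real change wires the policy through `PhysicalPlanConsumerOptions`, whose fields are not shown in this message.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not from the PR): returns the canonical path of
/// `path` only if it is a regular file under one of `allowed_roots`.
fn resolve_allowed_file(path: &Path, allowed_roots: &[PathBuf]) -> io::Result<PathBuf> {
    // Default deny: with no configured roots, no local file is importable.
    if allowed_roots.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "no allowed local file roots configured",
        ));
    }

    // Canonicalizing resolves `..` components and symlinks, so the
    // comparison below sees the real location of the file.
    let canonical = fs::canonicalize(path)?;

    // Folders are rejected in this sketch; only regular files pass.
    if !canonical.is_file() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "only regular files may be imported",
        ));
    }

    for root in allowed_roots {
        // Roots are canonicalized too, so both sides use the same form.
        let root = fs::canonicalize(root)?;
        // `Path::starts_with` compares whole components, so a root of
        // `/data` does not match `/database/file.parquet`.
        if canonical.starts_with(&root) {
            return Ok(canonical);
        }
    }

    Err(io::Error::new(
        io::ErrorKind::PermissionDenied,
        "path is outside the allowed local file roots",
    ))
}
```

Under these assumptions, a path such as `allowed/../other/b.parquet` is rejected because it canonicalizes to a location outside the root. Note that `fs::canonicalize` requires the path to exist, so a missing file or missing root surfaces as an I/O error rather than a policy error.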
