adriangb opened a new pull request, #21949: URL: https://github.com/apache/datafusion/pull/21949
## Which issue does this PR close? - Closes #21835. - Supersedes the draft #21929 (re-implementation on a cleaner foundation — no `FromProto`/`TryFromProto` workaround in the end state). ## Rationale for this change `datafusion-proto` serializes every built-in `PhysicalExpr` through a single ~300-line `downcast_ref` chain in `serialize_physical_expr_with_converter`, with a symmetric `match` on `ExprType` in `from_proto.rs`. Because the serializer lives outside the crate where each expression is defined, **every piece of internal state an expression wants to round-trip has to be `pub`**. The motivating example was #21807, which had to expose `pub struct Inner`, `pub fn inner()`, `pub fn from_parts()`, etc. on `DynamicFilterPhysicalExpr` — all "warning: not stable; proto-only". This PR puts the infrastructure in place so future stateful expressions don't hit that wall, **and** does the architectural cleanup written up at [#21835 (comment)](https://github.com/apache/datafusion/issues/21835#issuecomment-4348350257) in the same stack — so the end state has no `FromProto`/`TryFromProto` workaround traits at all. ## What changes are included in this PR? 11 stacked commits, two phases: ### Phase 1 — Architectural cleanup (commits 1–7) 1. **Rename `datafusion-proto-common` → `datafusion-proto-models`.** Pure rename via `git mv`; package, lib name, and all consumer references updated. 2. **Absorb `datafusion.proto` into proto-models.** Both protobuf packages now live in a single crate; `datafusion-proto/src/generated` shrinks to a thin re-export. Introduces `PhysicalPlanNodeExt` for the inherent-impl-on-foreign-type cases (genuinely required) and adds transient `FromProto` / `TryFromProto` traits in `datafusion_proto::convert` for the foreign-foreign `From`/`TryFrom` cases (deleted by C8). 3. **Move common-target conv impls into proto-models.** `JoinType`, `JoinConstraint`, `NullEquality`, `UnnestOptions`, `TableReference`, `StringifiedPlan` — back to standard `(&p).try_into()?` everywhere they're called. 4. **Add `proto` feature to `datafusion-expr`, host expr-target conv impls.** `WindowFrame`, `WindowFrameBound`, `WindowFrameUnits`, `WriteOp`, `NullTreatment`. 5. **Add `proto` feature to `datafusion-datasource`, host file impls.** `PartitionedFile`, `FileGroup`, `FileRange`, `FileSinkConfig`. 6. **Add `proto` feature to `datasource-{csv,json,parquet}`, host sink impls.** `JsonSink`, `CsvSink`, `ParquetSink`. 7. **Delete the transient `FromProto` / `TryFromProto` traits.** `datafusion/proto/src/convert.rs` is gone; the `convert_required_proto!` macro is gone; `lib.rs` no longer exports them. End-state: callers see plain `(&p).try_into()?` and `y.into()` everywhere, matching the target architecture in the issue comment. (One pragmatic deviation: proto-models keeps its `datafusion-common` dep rather than breaking it via a `datafusion-common[proto]` feature — common-target impls live in proto-models instead. Avoids the cycle without the Error/`FromOptionalField` refactor; trades a slightly heavier proto-models for ~1500 lines less churn.) ### Phase 2 — `PhysicalExpr::to_proto` hook (commits 8–11) 8. **Add `PhysicalExpr::to_proto` hook (feature-gated).** New `proto` feature on `datafusion-physical-expr-common`: ```rust #[cfg(feature = "proto")] fn to_proto( &self, ctx: &PhysicalExprEncodeCtx<'_>, ) -> Result<Option<PhysicalExprNode>> { Ok(None) } ``` `Ok(None)` (the default) means "fall through" — `datafusion-proto`'s downcast chain handles the expression. `Ok(Some(node))` lets the expression serialize itself, so types with private state (`DynamicFilterPhysicalExpr`'s `RwLock`-wrapped inner, etc.) won't need pub-for-proto accessors when they migrate. 9. **Migrate `Column` to the `to_proto` hook.** First expression. Removes the `Column` arm from datafusion-proto's downcast chain. 10. **Wrap encoder in `PhysicalExprEncodeCtx` struct.** The bare encoder trait moves behind a concrete struct so expression authors don't see `&dyn` in their signatures. Stable surface for adding helpers (UDF/UDAF/UDWF encoding, registry hooks) later without expanding a public trait. 11. **Add `PhysicalExprDecodeCtx` and migrate `Column` decode.** Decode-side counterpart with a single `decode(node)` method. `impl Column { fn try_from_proto(node, ctx) -> Result<Arc<Self>> }`. `Column`'s arm in the central match dispatches via `Column::try_from_proto`. ## Are these changes tested? Existing `roundtrip_logical_plan` and `roundtrip_physical_plan` tests cover wire-compatibility throughout. **161 proto tests pass at every commit**; 6 pre-existing failures need the `parquet-testing` git submodule and are unrelated to this change. No new tests were added — no new behavior was introduced. The migrated `Column` produces and consumes the same wire format as before; all conv impls are byte-for-byte equivalent in their new homes. ## Are there any user-facing changes? Yes, two API breaks in `datafusion-proto`: - `protobuf::PhysicalPlanNode::try_from_physical_plan_with_converter` (and `try_into_physical_plan_with_converter`) become methods on a new `PhysicalPlanNodeExt` trait. External callers need `use datafusion_proto::physical_plan::PhysicalPlanNodeExt;`. Genuinely required because the prost types are now foreign to `datafusion-proto`. - `protobuf::FileGroup::try_from(&[PartitionedFile])` is now `protobuf::FileGroup::try_from(&FileGroup)` (slice of foreign type can't satisfy the orphan rule in `datafusion-datasource`). The new `proto` features on `datafusion-common`-adjacent crates (expr, datasource, datasource-{csv,json,parquet}, physical-expr-common, physical-expr) are **off by default**; crates that don't serialize plans pay nothing. `datafusion-proto` flips them on. The crate rename `datafusion-proto-common` → `datafusion-proto-models` is the main visible churn for downstream consumers — `datafusion-ffi` uses the new path; in-tree examples and benchmarks are updated. ## Notes for reviewers - Each commit is independently reviewable and keeps the workspace compiling. The diff-minimization rule was applied (`git mv` for moves, semantic changes in follow-up commits) where it preserved history clearly. - Only `Column` is migrated to the `to_proto` hook. Other built-ins (`Literal`, `BinaryExpr`, etc.) and `DynamicFilterPhysicalExpr` are intentionally left on the existing downcast/match path; their migration is purely additive in follow-up PRs and doesn't change wire format. - Decode side uses a single-method `decode(node)` API on the ctx, matching the encode side's `to_proto(&ctx)` shape — both built-in and third-party expressions look the same when recursing into children. Third parties still need `PhysicalExtensionCodec` for the registry / dispatch role; an inventory-style `register::<T>()` builder is feasible later without further public-API churn. 🤖 Generated with [Claude Code](https://claude.com/claude-code) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
