adriangb opened a new pull request, #21929:
URL: https://github.com/apache/datafusion/pull/21929
## Which issue does this PR close?
- Closes #21835.
## Rationale for this change
`datafusion-proto` serializes every built-in `PhysicalExpr` through a single
~300-line `downcast_ref` chain in `serialize_physical_expr_with_converter`,
with a symmetric `match` on `ExprType` in `from_proto.rs`. Because the
serializer lives outside the crate where each expression is defined, **every
piece of internal state an expression wants to round-trip has to be `pub`**.
The motivating example was #21807, which had to expose `pub struct Inner`, `pub
fn inner()`, `pub fn from_parts()`, `pub fn original_children()`, `pub fn
remapped_children()` on `DynamicFilterPhysicalExpr` — all "warning: not stable;
proto-only".
This PR puts the infrastructure in place so future stateful expressions
don't hit that wall:
1. The prost-generated types move to a new lightweight
`datafusion-proto-models` crate.
2. `PhysicalExpr` gains an opt-in `to_proto` method (feature-gated).
3. Per-expression `try_from_proto` constructors land via a parallel
`PhysicalExprDecodeCtx`.
4. `Column` is migrated end-to-end as a working demo of the new pattern.
The remaining built-in expressions stay on the existing downcast/match path;
they migrate one-by-one in follow-ups (each migration is independent and keeps
behavior unchanged).
## What changes are included in this PR?
Five stacked commits, each independently mergeable / splittable later:
1. **Extract `datafusion-proto-models` crate.** Pure refactor — moves
`.proto`, gen, `src/generated/*` into a new lightweight crate (mirror of
`datafusion-proto-common`'s split). `datafusion-proto` re-exports the proto
types via `pub mod protobuf`, so external consumers (`datafusion-ffi`,
`datafusion-examples`, `benchmarks`) need no changes.
Because the prost types are now foreign to `datafusion-proto`, the orphan
rule forced two pieces of trait surgery:
- Inherent `impl protobuf::PhysicalPlanNode { ... }` → new
`PhysicalPlanNodeExt` trait. Callers that used
`protobuf::PhysicalPlanNode::try_from_physical_plan_with_converter(...)` need a
`use datafusion_proto::physical_plan::PhysicalPlanNodeExt;` (small API break,
all in-tree call sites updated).
- `impl From<&protobuf::X> for Y` (and `TryFrom`) for foreign-foreign
pairs → new `FromProto`/`TryFromProto` traits in `datafusion_proto::convert`.
Callers go from `(&p).into()` to `Y::from_proto(&p)`.
2. **Add `PhysicalExpr::to_proto` hook (feature-gated).** Adds an opt-in
`proto` feature on `datafusion-physical-expr-common`:
```rust
#[cfg(feature = \"proto\")]
fn to_proto(
&self,
ctx: &PhysicalExprEncodeCtx<'_>,
) -> Result<Option<PhysicalExprNode>> { Ok(None) }
```
`Ok(None)` (the default) means \"fall through\" — `datafusion-proto`'s
downcast chain handles the expression. `Ok(Some(node))` lets the expression
serialize itself, so types with private state (`DynamicFilterPhysicalExpr`'s
`RwLock`-wrapped inner, etc.) won't need pub-for-proto accessors when they
migrate.
3. **Migrate `Column` to `PhysicalExpr::to_proto`.** First expression to use
the hook. Removes its arm from the downcast chain.
4. **Wrap encoder in `PhysicalExprEncodeCtx` struct.** The bare
`PhysicalExprEncoder` trait moves behind a concrete `PhysicalExprEncodeCtx`
struct so expression authors don't see `&dyn` in their signatures. Mirrors how
`PhysicalPlanDecodeContext` is shaped today. Gives us a stable place to add
helpers (UDF/UDAF/UDWF encoding, future registry hooks) without expanding a
public trait surface.
5. **Add `PhysicalExprDecodeCtx` and migrate `Column` decode.** Decode-side
counterpart with a single-method API:
```rust
pub struct PhysicalExprDecodeCtx<'a> { ... }
impl PhysicalExprDecodeCtx<'_> {
pub fn schema(&self) -> &Schema;
pub fn decode(&self, node: &PhysicalExprNode) -> Result<Arc<dyn
PhysicalExpr>>;
}
impl Column {
pub fn try_from_proto(
node: &PhysicalColumn,
ctx: &PhysicalExprDecodeCtx<'_>,
) -> Result<Arc<Self>>;
}
```
Whatever \"decode\" needs to do (central match for built-ins,
codec/registry for extensions) is hidden behind one `decode` method on the ctx
— same shape regardless of expression origin.
## Are these changes tested?
The existing `roundtrip_physical_plan` and `roundtrip_physical_expr` tests
cover `Column` round-tripping in both directions, and they exercise the new
`to_proto` / `try_from_proto` path. All 159 proto tests pass. (6 pre-existing
failures require the `parquet-testing` git submodule, unrelated to this change.)
No new tests were added because no new behavior was introduced — the
migrated `Column` produces and consumes the same wire format as before.
## Are there any user-facing changes?
Yes, small API breaks in `datafusion-proto`:
- `protobuf::PhysicalPlanNode::try_from_physical_plan_with_converter` and
`try_into_physical_plan_with_converter` are now methods on the
`PhysicalPlanNodeExt` trait. External callers need `use
datafusion_proto::physical_plan::PhysicalPlanNodeExt;` to continue calling them
with the same syntax. (Two examples in this repo updated.)
- `impl From<...> for protobuf::X` / `impl From<&protobuf::X> for ...`
between proto types and types from `datafusion-common` / `datafusion-expr` /
`datafusion-datasource` are replaced by `FromProto`/`TryFromProto` traits in
`datafusion_proto::convert`. Callers replace `Y::from(...)` / `(&x).into()`
with `Y::from_proto(...)`.
The new `proto` feature on `datafusion-physical-expr-common` and
`datafusion-physical-expr` is **off by default**; crates that don't serialize
plans pay nothing. `datafusion-proto` flips it on.
## Notes for reviewers
- Each commit is independently mergeable and could be split into its own PR
— happy to do that if reviewers prefer. Kept stacked here for higher-level
review of the design as a whole.
- Only `Column` is migrated to the new pattern. Other built-ins (`Literal`,
`BinaryExpr`, etc.) and `DynamicFilterPhysicalExpr` are intentionally left on
the existing downcast/match path; their migration is purely additive in
follow-up PRs and doesn't change wire format.
- The decode side uses a single-method `decode(node)` API on the ctx,
matching the encode side's `to_proto(&ctx)` shape — both built-in and
third-party expressions will look the same to recurse into children.
Third-parties still need `PhysicalExtensionCodec` for the registry / dispatch
role; an inventory-style `register::<T>()` builder is feasible later without
further public-API churn (the ctx hides the dispatch trait, so the
implementation behind it can swap freely).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]