wirybeaver commented on code in PR #20763:
URL: https://github.com/apache/datafusion/pull/20763#discussion_r3135867952
##########
datafusion/expr/src/logical_plan/dml.rs:
##########
@@ -239,6 +239,8 @@ pub enum WriteOp {
Ctas,
/// `TRUNCATE` operation
Truncate,
+ /// `MERGE INTO` operation
+ MergeInto(MergeIntoOp),
Review Comment:
Done in the amended commit. Extended `DmlNode` with a `MERGE_INTO` type tag
and an optional `MergeIntoOpNode` payload (carrying `on`, the clauses, per-clause
predicates, and the Update/Insert/Delete actions with their embedded `Expr`s).
Wired it through `to_proto`/`from_proto` and added a `pub fn
parse_write_op(&DmlNode, registry, codec)` helper that the
`LogicalPlanType::Dml` deserializer uses. The tag-only `From<dml_node::Type>
for WriteOp` impl is kept for back-compat, with a documented `unreachable!` arm
for `MergeInto`, because the tag alone cannot carry the payload; callers must
use `parse_write_op`. Round-trip test added in
`roundtrip_logical_plan_dml_merge_into`.
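
For readers skimming the thread, here is a minimal sketch of the split between the tag-only conversion and the payload-aware helper. All types are simplified, hypothetical stand-ins (`DmlType`, a `String` in place of `Expr`, no `registry`/`codec` arguments), not the actual `datafusion-proto` definitions:

```rust
// Hypothetical, simplified stand-ins for illustration only; the real
// DmlNode / WriteOp types and the real parse_write_op signature differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DmlType {
    Truncate,
    MergeInto,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct MergeIntoOp {
    on: String, // the real field is an `Expr`
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum WriteOp {
    Truncate,
    MergeInto(MergeIntoOp),
}

struct DmlNode {
    dml_type: DmlType,
    merge_into: Option<MergeIntoOp>, // optional payload next to the tag
}

// Tag-only conversion: it cannot build `MergeInto` because the payload is
// not available from the tag.
impl From<DmlType> for WriteOp {
    fn from(t: DmlType) -> Self {
        match t {
            DmlType::Truncate => WriteOp::Truncate,
            DmlType::MergeInto => {
                unreachable!("MergeInto carries a payload; use parse_write_op")
            }
        }
    }
}

// Payload-aware helper: handles `MergeInto` itself and falls back to the
// tag-only conversion for every other variant.
fn parse_write_op(node: &DmlNode) -> Result<WriteOp, String> {
    match node.dml_type {
        DmlType::MergeInto => node
            .merge_into
            .clone()
            .map(WriteOp::MergeInto)
            .ok_or_else(|| "MERGE_INTO tag without payload".to_string()),
        other => Ok(WriteOp::from(other)),
    }
}
```

In this sketch a `MERGE_INTO` tag with a missing payload surfaces as an error rather than a panic; whether the real helper does the same is an assumption.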
##########
datafusion/expr/src/logical_plan/dml.rs:
##########
@@ -291,6 +294,62 @@ impl Display for InsertOp {
}
}
+/// Describes a MERGE INTO operation's parameters.
+///
+/// This is carried inside `WriteOp::MergeInto` and contains
+/// the ON condition and WHEN clauses that the TableProvider
+/// needs to execute the merge.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoOp {
+ /// The join condition from `ON <expr>`.
+ /// Kept as a general logical Expr; downstream providers
+ /// (e.g., Iceberg) can decompose into column pairs if needed.
+ pub on: Expr,
+ /// The WHEN clauses, in the order they appeared in the SQL.
+ pub clauses: Vec<MergeIntoClause>,
+}
+
+/// A single WHEN clause within a MERGE INTO statement.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoClause {
+ /// Whether this fires on matched or unmatched rows.
+ pub kind: MergeIntoClauseKind,
+ /// Optional additional predicate (`AND <expr>`).
+ pub predicate: Option<Expr>,
+ /// The action to take.
Review Comment:
Ack — will wire the `on` Expr and per-clause `predicate` / `action` Exprs
through `apply_expressions` and `with_new_exprs` in the planner PR (task 2 of
#20746). Noting it here so it doesn't get lost.
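
A rough sketch of what that wiring could look like, assuming a flat, deterministic expression order (`on`, then each clause's predicate followed by its action expressions). The types are hypothetical stand-ins (`Expr` is a `String`, and the shape of `MergeIntoAction` is assumed), and `exprs` / `with_new_exprs` here are illustrative methods, not the planner's actual API:

```rust
// Hypothetical stand-ins for illustration; the real types live in
// datafusion-expr and the real action payloads may look different.
type Expr = String;

#[derive(Debug, Clone, PartialEq)]
enum MergeIntoAction {
    Update(Vec<Expr>),
    Insert(Vec<Expr>),
    Delete,
}

#[derive(Debug, Clone, PartialEq)]
struct MergeIntoClause {
    predicate: Option<Expr>,
    action: MergeIntoAction,
}

#[derive(Debug, Clone, PartialEq)]
struct MergeIntoOp {
    on: Expr,
    clauses: Vec<MergeIntoClause>,
}

// Pull the next replacement expression or report that too few were given.
fn take(it: &mut std::vec::IntoIter<Expr>) -> Result<Expr, String> {
    it.next().ok_or_else(|| "too few expressions".to_string())
}

impl MergeIntoOp {
    /// Collect every embedded expression in a fixed order.
    fn exprs(&self) -> Vec<&Expr> {
        let mut out = vec![&self.on];
        for c in &self.clauses {
            out.extend(c.predicate.iter());
            match &c.action {
                MergeIntoAction::Update(es) | MergeIntoAction::Insert(es) => {
                    out.extend(es.iter())
                }
                MergeIntoAction::Delete => {}
            }
        }
        out
    }

    /// Rebuild the op from replacements given in the same order as `exprs`.
    fn with_new_exprs(&self, new: Vec<Expr>) -> Result<Self, String> {
        let mut it = new.into_iter();
        let on = take(&mut it)?;
        let mut clauses = Vec::with_capacity(self.clauses.len());
        for c in &self.clauses {
            let predicate = if c.predicate.is_some() {
                Some(take(&mut it)?)
            } else {
                None
            };
            let action = match &c.action {
                MergeIntoAction::Update(es) => MergeIntoAction::Update(
                    es.iter().map(|_| take(&mut it)).collect::<Result<Vec<_>, _>>()?,
                ),
                MergeIntoAction::Insert(es) => MergeIntoAction::Insert(
                    es.iter().map(|_| take(&mut it)).collect::<Result<Vec<_>, _>>()?,
                ),
                MergeIntoAction::Delete => MergeIntoAction::Delete,
            };
            clauses.push(MergeIntoClause { predicate, action });
        }
        if it.next().is_some() {
            return Err("too many expressions".to_string());
        }
        Ok(MergeIntoOp { on, clauses })
    }
}
```

The point of the sketch is the invariant: both directions must agree on the expression order, so that feeding `exprs()` back into `with_new_exprs` reproduces the original op.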
##########
datafusion/expr/src/logical_plan/dml.rs:
##########
@@ -291,6 +294,62 @@ impl Display for InsertOp {
}
}
+/// Describes a MERGE INTO operation's parameters.
+///
+/// This is carried inside `WriteOp::MergeInto` and contains
+/// the ON condition and WHEN clauses that the TableProvider
+/// needs to execute the merge.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoOp {
+ /// The join condition from `ON <expr>`.
+ /// Kept as a general logical Expr; downstream providers
+ /// (e.g., Iceberg) can decompose into column pairs if needed.
+ pub on: Expr,
+ /// The WHEN clauses, in the order they appeared in the SQL.
+ pub clauses: Vec<MergeIntoClause>,
+}
+
+/// A single WHEN clause within a MERGE INTO statement.
+#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
+pub struct MergeIntoClause {
+ /// Whether this fires on matched or unmatched rows.
+ pub kind: MergeIntoClauseKind,
+ /// Optional additional predicate (`AND <expr>`).
+ pub predicate: Option<Expr>,
+ /// The action to take.
+ pub action: MergeIntoAction,
+}
+
+/// Which rows a MERGE WHEN clause applies to.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Hash)]
+pub enum MergeIntoClauseKind {
+ /// WHEN MATCHED
+ Matched,
+ /// WHEN NOT MATCHED (synonymous with NOT MATCHED BY TARGET)
+ NotMatched,
+ /// WHEN NOT MATCHED BY TARGET
+ NotMatchedByTarget,
+ /// WHEN NOT MATCHED BY SOURCE
+ NotMatchedBySource,
Review Comment:
Kept both variants and added a type-level doc on `MergeIntoClauseKind`
calling out that they are semantically identical (source row with no matching
target row) — `NotMatched` is the SQL standard short form
(Snowflake/Postgres/SQL Server), `NotMatchedByTarget` is BigQuery's explicit
form. The doc states downstream consumers (planners, table providers,
optimizers) MUST treat them identically. Mirroring sqlparser keeps the parsed
SQL spelling round-tripping losslessly. Per-variant docs cross-link to the
type-level note. Happy to revisit and collapse to a single canonical variant if
you'd prefer that — just wanted to keep this PR scoped to type defs + proto
wiring.
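
To make the "MUST treat them identically" contract concrete, here is a small sketch of how a consumer might normalize before matching. The enum mirrors the one in the diff; `canonical` and `is_semantically_eq` are hypothetical helpers that are not part of this PR:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Hash)]
pub enum MergeIntoClauseKind {
    Matched,
    NotMatched,
    NotMatchedByTarget,
    NotMatchedBySource,
}

impl MergeIntoClauseKind {
    /// Hypothetical helper: fold the short spelling into the explicit one
    /// so consumers only need to handle three semantic cases.
    pub fn canonical(self) -> Self {
        match self {
            Self::NotMatched => Self::NotMatchedByTarget,
            other => other,
        }
    }

    /// Hypothetical helper: compare by meaning rather than by SQL spelling.
    pub fn is_semantically_eq(self, other: Self) -> bool {
        self.canonical() == other.canonical()
    }
}
```

The derived `PartialEq` still distinguishes the two spellings, which is what keeps the round trip lossless; only consumers that care about semantics would go through the normalizing helper.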
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]