+1 (binding)

DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1

> On Apr 3, 2026, at 2:59 PM, Andreas Neumann <[email protected]> wrote:
>
> Hi Spark devs,
>
> I'd like to call a vote on the SPIP: Auto CDC Support for Apache Spark
>
> Motivation
> With the upcoming introduction of standardized CDC support
> <https://issues.apache.org/jira/browse/SPARK-55668>, Spark will soon have a
> unified way to produce change data feeds. However, consuming these feeds and
> applying them to a target table remains a significant challenge.
>
> Common patterns like SCD Type 1 (maintaining a 1:1 replica) and SCD Type 2
> (tracking full change history) often require hand-crafted, complex MERGE
> logic. In distributed systems, these implementations are frequently
> error-prone when handling deletions or out-of-order data.
>
> Proposal
> This SPIP proposes a new "Auto CDC" flow type for Spark. It encapsulates the
> complex logic for SCD types and out-of-order data, allowing data engineers to
> configure a declarative flow instead of writing manual MERGE statements. This
> feature will be available in both Python and SQL.
>
> Example SQL:
> -- Produce a change feed
> CREATE STREAMING TABLE cdc.users AS
> SELECT * FROM STREAM my_table CHANGES FROM VERSION 10;
>
> -- Consume the change feed
> CREATE FLOW flow
> AS AUTO CDC INTO
> target
> FROM stream(cdc_data.users)
> KEYS (userId)
> APPLY AS DELETE WHEN operation = "DELETE"
> SEQUENCE BY sequenceNum
> COLUMNS * EXCEPT (operation, sequenceNum)
> STORED AS SCD TYPE 2
> TRACK HISTORY ON * EXCEPT (city);
>
> Relevant Links:
> SPIP Document:
> https://docs.google.com/document/d/1Hp5BGEYJRHbk6J7XUph3bAPZKRQXKOuV1PEaqZMMRoQ/
> Discussion Thread:
> https://lists.apache.org/thread/j6sj9wo9odgdpgzlxtvhoy7szs0jplf7
> JIRA:
> <https://issues.apache.org/jira/browse/SPARK-55668>https://issues.apache.org/jira/browse/SPARK-56249
>
> The vote will be open for at least 72 hours. Please vote:
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don't think this is a good idea because ...
>
> Cheers
> -Andreas
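
For readers unfamiliar with the out-of-order problem the quoted proposal mentions, here is a minimal plain-Python sketch of SCD Type 1 semantics keyed on a sequence number. It is not the SPIP's implementation and uses no Spark API. The names `Change`, `apply_scd1`, and `last_seq` are hypothetical, and the rule "drop any change whose sequence number is not newer than the last one applied for that key" is an assumption about how such a flow could behave.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass
class Change:
    """One change-feed record (hypothetical shape, for illustration only)."""
    key: str
    operation: str          # "INSERT", "UPDATE", or "DELETE"
    sequence_num: int       # ordering column, as in SEQUENCE BY sequenceNum
    row: Dict[str, Any]     # payload columns, without operation/sequence_num


def apply_scd1(
    target: Dict[str, Dict[str, Any]],
    last_seq: Dict[str, int],
    changes: Iterable[Change],
) -> None:
    """Apply changes to `target` in place, keeping one current row per key.

    `last_seq` remembers the highest sequence number applied per key,
    including deletes, so a late-arriving older change cannot overwrite
    newer state or resurrect a deleted row.
    """
    for change in changes:
        seen = last_seq.get(change.key)
        if seen is not None and change.sequence_num <= seen:
            continue  # stale or duplicate change: ignore it
        last_seq[change.key] = change.sequence_num
        if change.operation == "DELETE":
            target.pop(change.key, None)
        else:
            target[change.key] = dict(change.row)


if __name__ == "__main__":
    target: Dict[str, Dict[str, Any]] = {}
    last_seq: Dict[str, int] = {}
    feed = [
        Change("u1", "INSERT", 1, {"city": "Berlin"}),
        Change("u1", "UPDATE", 3, {"city": "Paris"}),
        Change("u1", "UPDATE", 2, {"city": "Rome"}),   # arrives late
        Change("u2", "DELETE", 5, {}),
        Change("u2", "INSERT", 4, {"city": "Oslo"}),   # arrives late
    ]
    apply_scd1(target, last_seq, feed)
    print(target)
```

A hand-written MERGE has to encode the same staleness check and delete tracking itself, which is the kind of logic the proposal describes as error-prone.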
