+1 (binding) 

DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1

> On Apr 3, 2026, at 2:59 PM, Andreas Neumann <[email protected]> wrote:
> 
> Hi Spark devs,
> 
> I'd like to call a vote on the SPIP: Auto CDC Support for Apache Spark
> 
> Motivation
> With the upcoming introduction of standardized CDC support 
> <https://issues.apache.org/jira/browse/SPARK-55668>, Spark will soon have a 
> unified way to produce change data feeds. However, consuming these feeds and 
> applying them to a target table remains a significant challenge.
> 
> Common patterns like SCD Type 1 (maintaining a 1:1 replica) and SCD Type 2 
> (tracking full change history) often require hand-crafted, complex MERGE 
> logic. In distributed systems, these implementations are frequently 
> error-prone when handling deletions or out-of-order data.
> 
> Proposal
> This SPIP proposes a new "Auto CDC" flow type for Spark. It encapsulates the 
> complex logic for SCD types and out-of-order data, allowing data engineers to 
> configure a declarative flow instead of writing manual MERGE statements. This 
> feature will be available in both Python and SQL.
> 
> Example SQL:
> -- Produce a change feed
> CREATE STREAMING TABLE cdc.users AS
> SELECT * FROM STREAM my_table CHANGES FROM VERSION 10;
> 
> -- Consume the change feed
> CREATE FLOW flow
> AS AUTO CDC INTO
>   target
> FROM stream(cdc.users)
>   KEYS (userId)
>   APPLY AS DELETE WHEN operation = "DELETE"
>   SEQUENCE BY sequenceNum
>   COLUMNS * EXCEPT (operation, sequenceNum)
>   STORED AS SCD TYPE 2
>   TRACK HISTORY ON * EXCEPT (city);
> 
> Relevant Links:
> SPIP Document: 
> https://docs.google.com/document/d/1Hp5BGEYJRHbk6J7XUph3bAPZKRQXKOuV1PEaqZMMRoQ/
> Discussion Thread: 
> https://lists.apache.org/thread/j6sj9wo9odgdpgzlxtvhoy7szs0jplf7
> JIRA: 
> https://issues.apache.org/jira/browse/SPARK-56249
> 
> The vote will be open for at least 72 hours. Please vote:
> 
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don't think this is a good idea because ...
> 
> Cheers -Andreas
> 
> 
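
For readers following the thread, here is a minimal sketch of the SCD Type 1 semantics the quoted proposal describes (KEYS, SEQUENCE BY, APPLY AS DELETE WHEN, COLUMNS * EXCEPT), written as plain in-memory Python. It is not the SPIP's implementation or API. The function name `apply_scd1`, the `last_seq` bookkeeping, and the sample rows are assumptions made for illustration only. The column names `userId`, `operation`, and `sequenceNum` are taken from the example SQL.

```python
from typing import Any

Row = dict[str, Any]


def apply_scd1(
    target: dict[Any, Row],
    last_seq: dict[Any, int],
    events: list[Row],
    key: str = "userId",
    seq: str = "sequenceNum",
    op: str = "operation",
) -> None:
    """Apply change events to an in-memory SCD Type 1 replica, in place.

    Hypothetical helper: `last_seq` remembers the highest sequence number
    applied per key, including deletes, so late events can be ignored.
    """
    for event in events:
        k, s = event[key], event[seq]
        # Skip events that are not newer than what was already applied.
        if k in last_seq and s <= last_seq[k]:
            continue
        # Recorded for deletes too, so the entry doubles as a tombstone.
        last_seq[k] = s
        if event[op] == "DELETE":
            target.pop(k, None)
        else:
            # Mirrors COLUMNS * EXCEPT (operation, sequenceNum).
            target[k] = {c: v for c, v in event.items() if c not in (op, seq)}


# Made-up events, deliberately delivered out of order.
events = [
    {"userId": 1, "name": "Ann", "city": "Oslo", "operation": "INSERT", "sequenceNum": 1},
    {"userId": 1, "name": "Ann", "city": "Rome", "operation": "UPDATE", "sequenceNum": 3},
    {"userId": 1, "name": "Ann", "city": "Paris", "operation": "UPDATE", "sequenceNum": 2},
    {"userId": 2, "name": "Bob", "city": "Lima", "operation": "DELETE", "sequenceNum": 5},
    {"userId": 2, "name": "Bob", "city": "Lima", "operation": "INSERT", "sequenceNum": 4},
]

target: dict[Any, Row] = {}
last_seq: dict[Any, int] = {}
apply_scd1(target, last_seq, events)
print(target)
```

Without the per-key sequence check, the late update (sequence 2) would overwrite the newer row, and the late insert (sequence 4) would resurrect a user that had already been deleted. Those are the two failure modes the motivation section attributes to hand-written MERGE logic.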
