+1

On Sat, Apr 4, 2026 at 09:45, vaquar khan <[email protected]> wrote:

> +1
>
> Regards,
> Viquar Khan
>
> On Sat, 4 Apr 2026 at 11:14, Lisa N. Cao <[email protected]> wrote:
>
>> +1 (non-binding)
>>
>> --
>> LNC
>>
>> On Fri, Apr 3, 2026, 5:15 PM Shixiong Zhu <[email protected]> wrote:
>>
>>> +1
>>>
>>>
>>> On Fri, Apr 3, 2026 at 5:03 PM Mich Talebzadeh <
>>> [email protected]> wrote:
>>>
>>>> +1
>>>>
>>>> Dr Mich Talebzadeh,
>>>> Data Scientist | Distributed Systems (Spark) | Financial Forensics &
>>>> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based
>>>> Analytics
>>>>
>>>>    view my Linkedin profile
>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, 3 Apr 2026 at 23:00, Andreas Neumann <[email protected]> wrote:
>>>>
>>>>> Hi Spark devs,
>>>>>
>>>>> I'd like to call a vote on the SPIP: *Auto CDC Support for Apache Spark*
>>>>>
>>>>> Motivation
>>>>>
>>>>> With the upcoming introduction of standardized CDC support
>>>>> <https://issues.apache.org/jira/browse/SPARK-55668>, Spark will soon
>>>>> have a unified way to produce change data feeds. However, consuming these
>>>>> feeds and applying them to a target table remains a significant challenge.
>>>>>
>>>>> Common patterns like SCD Type 1 (maintaining a 1:1 replica) and SCD
>>>>> Type 2 (tracking full change history) often require hand-crafted,
>>>>> complex MERGE logic. In distributed systems, these implementations
>>>>> are frequently error-prone when handling deletions or out-of-order data.
>>>>>
>>>>> Proposal
>>>>>
>>>>> This SPIP proposes a new "Auto CDC" flow type for Spark. It
>>>>> encapsulates the complex logic for SCD types and out-of-order data,
>>>>> allowing data engineers to configure a declarative flow instead of writing
>>>>> manual MERGE statements. This feature will be available in both Python
>>>>> and SQL.
>>>>>
>>>>> Example SQL:
>>>>>
>>>>> -- Produce a change feed
>>>>> CREATE STREAMING TABLE cdc.users AS
>>>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 10;
>>>>>
>>>>> -- Consume the change feed
>>>>> CREATE FLOW flow
>>>>> AS AUTO CDC INTO
>>>>>   target
>>>>> FROM stream(cdc_data.users)
>>>>>   KEYS (userId)
>>>>>   APPLY AS DELETE WHEN operation = "DELETE"
>>>>>   SEQUENCE BY sequenceNum
>>>>>   COLUMNS * EXCEPT (operation, sequenceNum)
>>>>>   STORED AS SCD TYPE 2
>>>>>   TRACK HISTORY ON * EXCEPT (city);
>>>>>
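
For readers following the thread, here is a minimal sketch of the kind of logic such a flow would hide, written in plain Python rather than Spark. It is not taken from the SPIP and does not reflect its implementation. It shows the simpler SCD Type 1 case (the SQL above uses Type 2), with late events dropped by sequence number and deletes remembered so that an older insert cannot resurrect a row. The class `Scd1Target` and its `apply` method are hypothetical; the column names `userId`, `operation`, and `sequenceNum` are borrowed from the SQL example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class Scd1Target:
    """Hypothetical in-memory stand-in for an SCD Type 1 target table."""

    # Current row per key (one row per userId).
    rows: Dict[Any, Dict[str, Any]] = field(default_factory=dict)
    # Highest sequence number applied per key, kept for deletes as well,
    # so a late event cannot undo a newer delete.
    last_seq: Dict[Any, int] = field(default_factory=dict)

    def apply(self, events: Iterable[Dict[str, Any]]) -> None:
        for ev in events:
            key, seq = ev["userId"], ev["sequenceNum"]
            # Skip out-of-order events older than what was already applied.
            if key in self.last_seq and seq <= self.last_seq[key]:
                continue
            self.last_seq[key] = seq
            if ev["operation"] == "DELETE":
                self.rows.pop(key, None)
            else:
                # Mirrors COLUMNS * EXCEPT (operation, sequenceNum).
                self.rows[key] = {
                    k: v
                    for k, v in ev.items()
                    if k not in ("operation", "sequenceNum")
                }


target = Scd1Target()
target.apply([
    {"userId": 1, "city": "Berlin", "operation": "INSERT", "sequenceNum": 1},
    {"userId": 1, "city": "Paris", "operation": "UPDATE", "sequenceNum": 3},
    # Arrives late: older than sequence 3, so it is ignored.
    {"userId": 1, "city": "Rome", "operation": "UPDATE", "sequenceNum": 2},
    {"userId": 2, "city": "Oslo", "operation": "DELETE", "sequenceNum": 5},
    # Arrives after the newer delete, so the row stays deleted.
    {"userId": 2, "city": "Oslo", "operation": "INSERT", "sequenceNum": 4},
])
print(target.rows)
```

Even this toy version has to keep per-key sequence state and tombstones; a hand-written MERGE needs the same bookkeeping, which is the part the proposal aims to make declarative.
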
>>>>>
>>>>> *Relevant Links:*
>>>>>
>>>>>    - SPIP Document:
>>>>>      https://docs.google.com/document/d/1Hp5BGEYJRHbk6J7XUph3bAPZKRQXKOuV1PEaqZMMRoQ/
>>>>>    - Discussion Thread:
>>>>>      https://lists.apache.org/thread/j6sj9wo9odgdpgzlxtvhoy7szs0jplf7
>>>>>    - JIRA: <https://issues.apache.org/jira/browse/SPARK-55668>
>>>>>      https://issues.apache.org/jira/browse/SPARK-56249
>>>>>
>>>>> *The vote will be open for at least 72 hours.* Please vote:
>>>>>
>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>> [ ] +0
>>>>> [ ] -1: I don't think this is a good idea because ...
>>>>>
>>>>> Cheers -Andreas
>>>>>
>>>>>
>>>>>
