Re: [I] Flink SQL with Iceberg snapshots doesn't react if table has upsert [iceberg]

via GitHub Thu, 14 Mar 2024 00:34:00 -0700


VidakM commented on issue #9948:
URL: https://github.com/apache/iceberg/issues/9948#issuecomment-1996740393


   Thanks for getting back to me.
   
   Looking at the docs, I see that Spark Structured Streaming carries a similar 
warning that only appends are supported.
   That's a pity, as upsert support would unlock a lot of interesting solutions.
   
   Is append+upsert (delete) streaming in Flink and Spark something you don't 
want to support because another approach is preferred, or is it just later on 
the roadmap? If the latter, any idea of the timeline?
   
   
   For us this would unlock plenty of interesting use cases for Iceberg.
   At the moment we dump raw events into Iceberg for replay storage, and with 
Flink we also unpack them and tidy them up a little. Call it a silver layer if 
you like.
   
   But if we could subscribe to the more complex upsert writes to the silver 
layer (late-arrival joins, aggregations, pivots, partial GDPR deletes), we could 
automate pipelines that maintain more “golden” tables. Data science teams would 
get clean, generated and maintained tables, SuperSet users would get better 
views, and it would be easier to stream transformed data from Iceberg to other 
products. As things stand, we would probably need to use Kafka or something 
similar as a buffer and do dual writes to get CDC for upserts.
   
   It would also work around some Flink limitations: we would not need to 
buffer as much in memory for late arrivals, and Flink lacks some of the good 
transformation SQL that Spark has.
   
   We would happily help contribute, if possible, given a few pointers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


