mxm opened a new pull request, #12424:
URL: https://github.com/apache/iceberg/pull/12424

   The Flink Iceberg sink allows writing data from a continuous Flink stream to 
an Iceberg table. It already powers countless production jobs and has passed 
the test of time. But for many users, the static nature of the Flink Iceberg 
sink became a serious limitation. The target table, its schema and partitioning 
spec, and even the write parallelism are statically configured. Any changes to 
these require rebuilding and restarting the underlying Flink job.
   
   We present the next generation Flink Iceberg sink, the Dynamic Flink Iceberg 
Sink.
   
   From the user perspective, there are three main advantages to using the 
Dynamic Iceberg sink. It supports:
   
   1. Writing to any number of tables (No more 1:1 sink/table relationship).
   2. Dynamically creating and updating tables based on a user-supplied routing.
   3. Dynamically updating the schema and partition spec of tables.
   
   All of this is done from within the sink, controlled by the user via the 
DynamicRecord class, without any Flink job restart.
   
   Design document: 
https://docs.google.com/document/d/1R3NZmi65S4lwnmNjH4gLCuXZbgvZV5GNrQKJ5NYdO9s/edit
   
   Major credits to @pvary who came up with the design and wrote the initial 
implementation.
   
   We have done some benchmarking to compare the old Iceberg sink with the 
Dynamic Iceberg sink. We found that the performance overhead of the Dynamic 
Sink is neglectable (see the results below). Of course, the real power of the 
Dynamic Sink unfolds for writing to multiple tables which is not supported in 
the regular Iceberg sink.
   
   ```
   TestDynamicIcebergSinkPerf.testIcebergSink
   
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 1119211093289016971 written 5000000 records in 9405 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 3781801840532551473 written 5000000 records in 8948 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 3033420860172241073 written 5000000 records in 8555 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 2032850461702063007 written 5000000 records in 8781 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 7558061307456947485 written 5000000 records in 8872 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 601297722355936169 written 5000000 records in 8330 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 6249065641929321708 written 5000000 records in 8512 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 2068854608451193539 written 5000000 records in 8351 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 2246238791649680214 written 5000000 records in 8433 ms
   
   TestDynamicIcebergSinkPerf.testDynamicSink
   
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 6885587612182426678 written 5000000 records in 9862 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 5143658647354768578 written 5000000 records in 9611 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 6787994330850859040 written 5000000 records in 8935 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 6788803513469485538 written 5000000 records in 8923 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 8592131276344399796 written 5000000 records in 9335 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 4567624953033955114 written 5000000 records in 8973 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 139484213775208428 written 5000000 records in 9275 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 5103930001749514761 written 5000000 records in 8982 ms
   [main] INFO org.apache.iceberg.flink.sink.dynamic.TestDynamicIcebergSinkPref 
- TEST RESULT: Snapshot 4174643942796111771 written 5000000 records in 9209 ms
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to