Re: [I] Is there any way on Flink to read newly appended data only (NOT in current Iceberg table snapshot)? [iceberg]

via GitHub Wed, 13 Mar 2024 21:03:57 -0700


dyzcs commented on issue #9955:
URL: https://github.com/apache/iceberg/issues/9955#issuecomment-1996358615


   Hi @okayhooni, here is my take on this issue. I am currently exploring 
Iceberg as a replacement for Kafka in **near real-time** processing. The 
problem you describe breaks down into two parts.
   
   The first part is that Iceberg does not maintain per-message offsets the 
way Kafka does, but each table keeps an internal list of snapshots in 
chronological order. You can record the latest snapshot ID each time and then 
start incremental consumption from that ID.
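
   To make the snapshot-ID-as-offset idea concrete, here is a minimal sketch in 
plain Python. It is a toy model, not the Iceberg or Flink API: `Snapshot`, 
`SnapshotLog`, `latest_id` and `after` are hypothetical names invented for 
illustration, and the file names are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    added_files: List[str]


@dataclass
class SnapshotLog:
    """Toy stand-in (hypothetical) for a table's chronological snapshot list."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def append(self, snapshot: Snapshot) -> None:
        self.snapshots.append(snapshot)

    def latest_id(self) -> Optional[int]:
        # The "offset" to remember: the ID of the newest snapshot, if any.
        return self.snapshots[-1].snapshot_id if self.snapshots else None

    def after(self, last_seen_id: Optional[int]) -> List[Snapshot]:
        """Snapshots committed strictly after last_seen_id (all if None)."""
        if last_seen_id is None:
            return list(self.snapshots)
        ids = [s.snapshot_id for s in self.snapshots]
        # Raises ValueError if the bookmarked snapshot is no longer in the log.
        start = ids.index(last_seen_id) + 1
        return self.snapshots[start:]


log = SnapshotLog()
log.append(Snapshot(101, 1000, ["a.parquet"]))
log.append(Snapshot(102, 2000, ["b.parquet"]))

bookmark = log.latest_id()  # remember the current snapshot ID

log.append(Snapshot(103, 3000, ["c.parquet", "d.parquet"]))

# Only files added after the bookmark are read.
new_files = [f for s in log.after(bookmark) for f in s.added_files]
```

   In this model the bookmark plays the role of a Kafka offset: files that were 
already in the table when the bookmark was taken are skipped.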
   
   The second part is how closely Iceberg can approximate Kafka's message 
ordering. Iceberg consumes snapshots one by one in FIFO order, and sorts the 
incremental files it reads by their write time before passing them 
downstream.
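
   The ordering described above can be sketched the same way. Again this is a 
toy model in plain Python, not Iceberg's actual implementation: `DataFile`, 
`PendingSnapshot` and `emit_in_order` are hypothetical names, and the write 
times are made-up integers.

```python
from collections import deque
from typing import Deque, Iterator, List, NamedTuple


class DataFile(NamedTuple):
    path: str
    write_time_ms: int


class PendingSnapshot(NamedTuple):
    snapshot_id: int
    files: List[DataFile]


def emit_in_order(pending: Deque[PendingSnapshot]) -> Iterator[str]:
    """Drain snapshots oldest-first; within each, emit files by write time."""
    while pending:
        snap = pending.popleft()  # FIFO: the oldest queued snapshot goes first
        for data_file in sorted(snap.files, key=lambda f: f.write_time_ms):
            yield data_file.path


queue = deque([
    PendingSnapshot(1, [DataFile("s1-late.parquet", 20),
                        DataFile("s1-early.parquet", 10)]),
    PendingSnapshot(2, [DataFile("s2-only.parquet", 5)]),
])
order = list(emit_in_order(queue))
```

   Note that in this sketch snapshot order takes priority over write time: the 
file in snapshot 2 is emitted last even though its write time is the smallest. 
The sketch only models ordering between files, not between rows inside a file.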
   
   That is my view; further discussion is welcome.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

