kriti-sc opened a new pull request, #2889:
URL: https://github.com/apache/iggy/pull/2889

   ## Which issue does this PR close?
   
   Closes #1852 
   
   ## Rationale
   
   Delta Lake is an open-source table format and storage layer that is widely 
used in modern streaming analytics architectures.
   
   ## What changed?
   
   Introduces a Delta Lake Sink Connector that enables writing data from Iggy 
to Delta Lake.
   
   The Delta Lake writing logic is heavily inspired by the [kafka-delta-ingest 
project](https://github.com/delta-io/kafka-delta-ingest/tree/main), which 
provides a proven starting point for writing to Delta Lake.
   
   ## Local Execution
   1. Produced 32632 messages with schema `user_id: String, user_type: u8, 
email: String, source: String, state: String, created_at: DateTime<Utc>, 
message: String` using [sample data 
producer](https://github.com/apache/iggy/tree/master/examples/rust/src/sink-data-producer).
   2. Consumed the messages with the Delta Lake sink, which created a Delta 
table on the local filesystem.
   3. Verified the row count and schema of the Delta table. 
   4. Added unit tests and e2e tests; both pass.
   
   <img width="1409" height="900" alt="image" 
src="https://github.com/user-attachments/assets/57cecc01-aa79-40bc-ae39-5debcc31d7a9"
 />
   Left: messages produced; Right (top): messages consumed by the Delta sink; 
Right (bottom): inspecting the Delta table in Python
   
   
   
   ## AI Usage
   
   If AI tools were used, please answer:
   1. Which tools? Claude Code
   2. Scope of usage? Generated functions
   3. How did you verify the generated code works correctly? Manual testing 
(producing data into Iggy, running the sink, and verifying that a local Delta 
Lake table was created), plus unit tests and e2e tests for Delta Lake on the 
local filesystem and on S3.
   4. Can you explain every line of the code if asked? Yes
   
   

