fresh-borzoni commented on code in PR #2934:
URL: https://github.com/apache/fluss/pull/2934#discussion_r2996303793


##########
website/blog/2026-03-25-fluss-rust-sdk.md:
##########
@@ -0,0 +1,149 @@
+---
+slug: fluss-rust-sdk
+title: "Why Apache Fluss Chose Rust for Its Multi-Language SDK"
+authors: [yuxia, keithlee, anton]
+image: ./assets/fluss_rust/banner.jpg
+---
+
+![Banner](assets/fluss_rust/banner.jpg)
+
+If you maintain a data system that only speaks Java, you will eventually hear 
from someone who doesn't. A Python team building a feature store. A C++ service 
that needs sub-millisecond writes. An AI agent that wants to call your system 
through a tool binding. They all need the same capabilities (writes, reads, 
lookups) and none of them want to spin up a JVM to get them.
+
+Apache Fluss, streaming storage for real-time analytics and AI, hit this exact 
inflection point. The [Java client](/blog/fluss-java-client) works well for 
Flink-based compute, where the JVM is already the world you live in. But 
outside that world, asking consumers to run a JVM sidecar just to write a 
record or look up a key creates friction that compounds across every service, 
every pipeline, every agent in the stack.
+
+We could have written a separate client for each language. Maintain five 
copies of the wire protocol, five implementations of the batching logic, five 
sets of retry semantics and idempotence tracking. That path scales linearly 
with languages and ends predictably: the Java client gets features first, the 
Python client gets them six months later with slightly different edge-case 
behavior, and the C++ client is perpetually "almost done."
+
+We took a different path, borrowing from a design that has already proven itself.
+
+<!-- truncate -->
+
+## The librdkafka Model
+
+If you've worked with Kafka clients outside of Java, you've probably used 
[librdkafka](https://github.com/confluentinc/librdkafka) without knowing it. 
It's a single C library that powers `confluent-kafka-python`, 
`confluent-kafka-go`, and others. One core handles the wire protocol, batching, 
memory management, and delivery semantics. Each language binding is a thin 
wrapper, glue on top of a battle-tested engine.
+
+The model is elegant because it inverts the usual maintenance equation. 
Instead of N full client implementations that diverge over time, each 
developing its own bugs, its own subtle behavioral differences, its own backlog 
of features the Java client has but the Python client doesn't yet, you get one 
implementation and N thin bindings that stay in sync by construction. A bug 
gets fixed once, and every language picks it up on the next build.
+
+The deeper benefit is correctness, not just code reuse. When you maintain 
three separate implementations of a client protocol, behavioral drift is 
inevitable. Edge cases in retry logic, subtle differences in how backpressure 
kicks in, inconsistencies in how idempotent writes handle sequence numbers. 
These are the bugs that don't show up in unit tests but surface in production 
under load, and they surface differently in each language.
+
+We built fluss-rust on this same idea. A single Rust core implements the full 
Fluss client protocol (Protobuf-based RPC, record batching with backpressure, 
background I/O, Arrow serialization, idempotent writes, SASL authentication) 
and exposes it to three languages:
+
+- **Rust**: directly, as the `fluss-rs` crate
+- **Python**: via [PyO3](https://pyo3.rs), the Rust-Python bridge
+- **C++**: via [CXX](https://cxx.rs), the Rust-C++ bridge
+
+To give a sense of proportion: the Rust core is roughly 40k lines, while the 
Python binding is around 5k and the C++ binding around 6k. The bindings handle 
type conversion, async runtime bridging, and memory ownership at the language 
boundary, but all the protocol logic, batching, Arrow codec, and retry handling 
live in the shared core.
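
The core-plus-thin-bindings split described above can be sketched in plain Rust. This is an illustration under stated assumptions, not the fluss-rs API: `CoreWriter`, `PyStyleWriter`, and `CppStyleWriter` are hypothetical names, the "flush" is a stand-in for a network send, and the real bindings go through PyO3 and CXX rather than plain wrapper structs.

```rust
// Hypothetical sketch: none of these types exist in fluss-rs.

/// Shared core: the batching logic lives here exactly once.
pub struct CoreWriter {
    batch_size: usize,
    pending: Vec<Vec<u8>>,
    flushed_batches: Vec<Vec<Vec<u8>>>,
}

impl CoreWriter {
    pub fn new(batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        Self {
            batch_size,
            pending: Vec::new(),
            flushed_batches: Vec::new(),
        }
    }

    /// Append a record and flush automatically once the batch is full.
    pub fn append(&mut self, record: Vec<u8>) {
        self.pending.push(record);
        if self.pending.len() >= self.batch_size {
            self.flush();
        }
    }

    /// Move pending records into a completed batch (stand-in for a send).
    pub fn flush(&mut self) {
        if !self.pending.is_empty() {
            let batch = std::mem::take(&mut self.pending);
            self.flushed_batches.push(batch);
        }
    }

    pub fn flushed_batch_count(&self) -> usize {
        self.flushed_batches.len()
    }
}

/// Stand-in for a Python-facing binding: converts a string, then delegates.
pub struct PyStyleWriter {
    core: CoreWriter,
}

impl PyStyleWriter {
    pub fn new(batch_size: usize) -> Self {
        Self { core: CoreWriter::new(batch_size) }
    }
    pub fn write_str(&mut self, s: &str) {
        self.core.append(s.as_bytes().to_vec());
    }
    pub fn flush(&mut self) {
        self.core.flush();
    }
    pub fn flushed_batch_count(&self) -> usize {
        self.core.flushed_batch_count()
    }
}

/// Stand-in for a C++-facing binding: copies a borrowed slice, then delegates.
pub struct CppStyleWriter {
    core: CoreWriter,
}

impl CppStyleWriter {
    pub fn new(batch_size: usize) -> Self {
        Self { core: CoreWriter::new(batch_size) }
    }
    pub fn write_bytes(&mut self, data: &[u8]) {
        self.core.append(data.to_vec());
    }
    pub fn flush(&mut self) {
        self.core.flush();
    }
    pub fn flushed_batch_count(&self) -> usize {
        self.core.flushed_batch_count()
    }
}
```

Both wrappers only convert their input type and forward the call, so a change to the batching rule in `CoreWriter` reaches every binding without touching the wrappers.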

Review Comment:
   Yes, it's a good place for a diagram, and I plan to include some 
visuals/diagrams, but a bit later, if we like the text and flow.


