Currently, Flink's ability to union read data from the datalake and Fluss is tightly coupled with Paimon's implementation, which limits its flexibility and extensibility. Paimon-related classes are hard-coded in the fluss-flink module. This makes it difficult for Flink to support union read with other datalakes, and for other compute engines such as Spark and Trino to integrate with the union read ability. What's more, the tight coupling obscures the core logic of union read, making the code harder to maintain and evolve.
To address this, I'd like to propose FIP-6: Decouple Flink union read with paimon[1], which seeks to decouple union read from Paimon by introducing well-defined interfaces and extension points that Paimon, like any other datalake, would implement. With these in place, Flink can support a wider range of datalakes. Furthermore, the standardized interfaces will allow other compute engines to integrate with Fluss's union read capability.

Your feedback and suggestions on this proposal are welcome. Looking forward to a productive discussion!

[1]: https://cwiki.apache.org/confluence/display/FLUSS/FIP-6%3A+Decouple+Flink+union+read+with+paimon

Best regards,
Yuxia
