luoyuxia opened a new issue, #2962: URL: https://github.com/apache/fluss/issues/2962
## Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.

## Description

This issue tracks the tiering-source part of splitting parent task #437.

Today, tiering source reads Fluss log data and converts it into downstream storage formats through row-oriented or storage-specific paths. To support a cleaner and more efficient Arrow-based pipeline, tiering source should be able to read data directly as Arrow.

This work would provide a reusable Arrow-native read path for tiering, and would also serve as the foundation for directly writing tiered data into Parquet in a later step.

Possible scope:

- add a tiering-source path that reads log data as Arrow;
- define the batch lifecycle/ownership clearly to avoid Arrow memory leaks;
- make the Arrow batch path reusable by downstream tiering writers.

This is intended to be one sub-task of #437, while the Arrow-to-Parquet conversion itself is tracked separately.

## Willingness to contribute

- [ ] I'm willing to submit a PR!
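
To make the batch lifecycle/ownership item in the scope more concrete, here is a rough sketch of one possible contract: the tiering source owns each batch and releases it right after the writer returns, and the writer only borrows it. Every name below (`TrackedBatch`, `BatchWriter`, `tier`) is hypothetical and does not exist in Fluss or Arrow. `TrackedBatch` is a plain-Java stand-in for an allocator-backed Arrow batch, with a counter playing the role of the allocator's outstanding-memory check, so the example runs without the Arrow dependency.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch only: none of these types exist in Fluss or Arrow. */
public class ArrowBatchOwnershipSketch {

    /** Stand-in for an Arrow batch; the shared counter tracks unreleased batches. */
    static final class TrackedBatch implements AutoCloseable {
        private final AtomicInteger outstanding;
        private final int rowCount;
        private boolean closed;

        TrackedBatch(AtomicInteger outstanding, int rowCount) {
            this.outstanding = outstanding;
            this.rowCount = rowCount;
            outstanding.incrementAndGet(); // "allocate"
        }

        int rowCount() {
            if (closed) {
                throw new IllegalStateException("batch already released");
            }
            return rowCount;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                outstanding.decrementAndGet(); // "release"
            }
        }
    }

    /** Hypothetical writer: borrows the batch during write() and must not retain it. */
    interface BatchWriter {
        void write(TrackedBatch batch);
    }

    /** The source side owns each batch and releases it even if the writer throws. */
    static int tier(int[] batchSizes, BatchWriter writer, AtomicInteger outstanding) {
        int rows = 0;
        for (int size : batchSizes) {
            try (TrackedBatch batch = new TrackedBatch(outstanding, size)) {
                writer.write(batch);
                rows += batch.rowCount();
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        AtomicInteger outstanding = new AtomicInteger();
        int rows = tier(new int[] {3, 5}, batch -> { }, outstanding);
        System.out.println(rows + " rows, outstanding=" + outstanding.get());
    }
}
```

Keeping ownership on the source side means a writer failure cannot leak a batch. A writer that needs to buffer data across calls would need an explicit ownership-transfer variant instead, which is one of the choices this sub-task would have to settle.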
