beryllw opened a new issue, #2915: URL: https://github.com/apache/fluss/issues/2915
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.

### Motivation

# Problem Description

When running a Tiering job with high write throughput, the data synchronization cannot keep up with the write speed. The root cause analysis reveals two main issues:

1. Parallelism is bounded by bucket count - Tiering job parallelism is 1:1 mapped to bucket count, limiting scalability
2. Read and write operations cannot be pipelined - Reading from Fluss and writing to Paimon are executed sequentially, resulting in low CPU utilization

# Root Cause Analysis

1. Split Granularity Equals Bucket Granularity: Each split covers exactly one bucket, which limits the maximum parallelism.
2. Sequential Read-Write Pattern: The current implementation reads from Fluss and writes to Paimon synchronously.

### Solution

_No response_

### Anything else?

_No response_

### Willingness to contribute

- [ ] I'm willing to submit a PR!
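
To illustrate the sequential read-write pattern named in the root cause analysis, here is a minimal sketch of what pipelining could look like: one thread reads batches while another writes them, connected by a bounded queue. This is not the Fluss or Paimon API. The class `PipelinedTieringSketch`, the method `runPipelined`, the `queueCapacity` parameter, and the use of in-memory lists as stand-ins for the Fluss source and the Paimon sink are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: none of these names come from Fluss or Paimon.
public class PipelinedTieringSketch {
    // Sentinel that marks the end of input; it is compared by reference.
    private static final List<String> END = new ArrayList<>();

    // Reads batches on a separate thread and "writes" them on the calling
    // thread, so a read of batch N+1 can overlap with the write of batch N.
    static List<String> runPipelined(List<List<String>> sourceBatches, int queueCapacity)
            throws InterruptedException {
        // The bounded queue applies backpressure: the reader blocks when the
        // writer falls behind by more than queueCapacity batches.
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);
        List<String> sink = new ArrayList<>(); // stand-in for the lake writer

        Thread reader = new Thread(() -> {
            try {
                for (List<String> batch : sourceBatches) {
                    queue.put(batch); // stand-in for polling records from a bucket
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();

        while (true) {
            List<String> batch = queue.take();
            if (batch == END) {
                break;
            }
            sink.addAll(batch); // stand-in for writing the batch to the lake
        }
        reader.join();
        return sink;
    }

    public static void main(String[] args) throws InterruptedException {
        List<List<String>> batches =
                List.of(List.of("r1", "r2"), List.of("r3"), List.of("r4", "r5"));
        System.out.println(runPipelined(batches, 2)); // prints [r1, r2, r3, r4, r5]
    }
}
```

A single reader and a single writer keep records in their original order, which matters if per-bucket ordering must be preserved. Whether this shape fits the actual Tiering job depends on how its reader and writer are structured, which the issue does not describe.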
