stevenzwu commented on PR #7161:
URL: https://github.com/apache/iceberg/pull/7161#issuecomment-2350416841

   @binshuohu Currently, there is no plan to reapply this change to the main 
branch. We now have a more general range distribution, guided by statistics 
collection: 
https://iceberg.apache.org/docs/nightly/flink-writes/#range-distribution-experimental.
 It is more general than this change, which covers bucketing only. Range 
distribution also handles different parallelisms and partitions well.
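
To illustrate what "guided by statistics collection" can mean, here is a minimal sketch of deriving range split points from per-key traffic counts. It is not Iceberg's implementation: the class `RangeSplitSketch`, the method `splitPoints`, and the string keys are hypothetical names chosen for illustration, and the sketch assumes the key counts have already been aggregated in one place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the Iceberg Flink sink code.
public class RangeSplitSketch {

    // Returns the inclusive upper-bound key of every range except the last one.
    // A very heavy key can cover several target shares, in which case fewer
    // than numRanges - 1 bounds are returned.
    static List<String> splitPoints(TreeMap<String, Long> keyCounts, int numRanges) {
        long total = 0;
        for (long count : keyCounts.values()) {
            total += count;
        }

        List<String> bounds = new ArrayList<>();
        long running = 0;
        for (Map.Entry<String, Long> entry : keyCounts.entrySet()) {
            running += entry.getValue();
            // Cumulative weight the ranges closed so far (plus this one) should reach.
            long target = total * (bounds.size() + 1) / numRanges;
            if (bounds.size() < numRanges - 1 && running >= target) {
                bounds.add(entry.getKey());
            }
        }
        return bounds;
    }

    public static void main(String[] args) {
        TreeMap<String, Long> stats = new TreeMap<>();
        stats.put("a", 70L); // skewed traffic: one hot key
        stats.put("b", 10L);
        stats.put("c", 10L);
        stats.put("d", 10L);
        System.out.println(splitPoints(stats, 2));
    }
}
```

Because the split follows observed weights rather than a fixed hash, the number of ranges can be chosen to match the writer parallelism independently of how many distinct keys or partitions exist.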
   
   Range distribution has one disadvantage: it performs statistics collection 
and aggregation to guide the range split, which adds a little overhead. The 
bucketing partitioner here assumes traffic is evenly distributed across 
buckets, which should be true (hash % nBuckets).
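
The even-distribution assumption behind `hash % nBuckets` can be checked with a small sketch like the one below. The class `BucketSketch` and its mixing function are hypothetical stand-ins: Iceberg's bucket transform uses a 32-bit Murmur3 hash, which is replaced here by a simple multiplicative mix so the example has no dependencies.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the partitioner from this PR.
public class BucketSketch {

    // Stand-in hash (assumption): multiply by an odd 64-bit constant,
    // then fold the result to 32 bits.
    static int hash(long key) {
        return Long.hashCode(key * 0x9E3779B97F4A7C15L);
    }

    // floorMod keeps the bucket id in [0, nBuckets) even for negative hashes.
    static int bucketOf(long key, int nBuckets) {
        return Math.floorMod(hash(key), nBuckets);
    }

    // Counts how many of the keys 0..nKeys-1 land in each bucket.
    static long[] countPerBucket(long nKeys, int nBuckets) {
        long[] counts = new long[nBuckets];
        for (long key = 0; key < nKeys; key++) {
            counts[bucketOf(key, nBuckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // With a well-mixing hash, the per-bucket counts should be close to each other.
        System.out.println(Arrays.toString(countPerBucket(1_000_000, 8)));
    }
}
```

If the printed counts are roughly equal, a partitioner that routes by bucket id needs no statistics, which is the trade-off against range distribution described above.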
   
   cc @pvary 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

