badalprasadsingh opened a new pull request, #524:
URL: https://github.com/apache/iceberg-go/pull/524

   # Partitioned Fanout Writer with Rolling Data File Support (Append Mode)
   
   This PR completes the implementation of partitioned writing with support for rolling data files in append mode. It enables efficient, parallelized ingestion into partitioned tables while maintaining manifest and snapshot correctness.
   
   **Slack Thread Discussion**: [Link](https://apache-iceberg.slack.com/archives/C05J3MJ42BD/p1751002533414969)
   **Proposal Document**: [Google Drive](https://drive.google.com/file/d/18CwR9nhwkThs-Q-JZZvisBEaDICvp5Z7/view?usp=drive_link)
   
   
   ### Details
   
   * Introduced parallel processing of `arrow.Record` using a user-defined number of goroutines.
   * Each goroutine maintains its own hash table to map partition keys to row indices.
   * After partitioning, `compute.Take()` is used to extract per-partition data slices.
   * Integrated dedicated rolling writers per partition to manage data file size thresholds and output constraints.
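
   For reviewers, the flow in the list above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: plain Go slices stand in for `arrow.Record` columns, the hypothetical `take` helper stands in for `compute.Take()`, and the hypothetical `rollFiles` helper rolls on a row count rather than a data file size threshold. The names `groupByPartition`, `take`, and `rollFiles` are made up for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// groupByPartition splits the rows into contiguous chunks, one per worker.
// Each goroutine builds its own map from partition key to row indices, and
// the per-goroutine maps are merged once all workers are done.
// keys[i] is the (already transformed) partition key of row i.
func groupByPartition(keys []string, workers int) map[string][]int {
	if workers < 1 {
		workers = 1
	}
	local := make([]map[string][]int, workers)
	chunk := (len(keys) + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		start, end := w*chunk, (w+1)*chunk
		if end > len(keys) {
			end = len(keys)
		}
		wg.Add(1)
		go func(w, start, end int) {
			defer wg.Done()
			m := make(map[string][]int)
			for i := start; i < end; i++ {
				m[keys[i]] = append(m[keys[i]], i)
			}
			local[w] = m // each goroutine writes only its own slot
		}(w, start, end)
	}
	wg.Wait()

	// Merge in worker order so indices within a partition stay ascending.
	merged := make(map[string][]int)
	for _, m := range local {
		for k, idx := range m {
			merged[k] = append(merged[k], idx...)
		}
	}
	return merged
}

// take gathers the values at the given row indices.
// Assumption: a plain-slice stand-in for compute.Take() on an Arrow column.
func take(values []int64, indices []int) []int64 {
	out := make([]int64, len(indices))
	for i, idx := range indices {
		out[i] = values[idx]
	}
	return out
}

// rollFiles cuts one partition's rows into "files" of at most maxRows rows.
// Assumption: a row count stands in for the data file size threshold.
func rollFiles(rows []int64, maxRows int) [][]int64 {
	if maxRows < 1 {
		maxRows = 1
	}
	var files [][]int64
	for len(rows) > maxRows {
		files = append(files, rows[:maxRows])
		rows = rows[maxRows:]
	}
	if len(rows) > 0 {
		files = append(files, rows)
	}
	return files
}

func main() {
	keys := []string{"us", "eu", "us", "eu", "us"}
	amounts := []int64{10, 20, 30, 40, 50}

	groups := groupByPartition(keys, 2)

	// Sort partition names only to make the printed order deterministic.
	names := make([]string, 0, len(groups))
	for k := range groups {
		names = append(names, k)
	}
	sort.Strings(names)

	for _, k := range names {
		fmt.Println(k, rollFiles(take(amounts, groups[k]), 2))
	}
}
```

   Giving each goroutine its own map avoids locking on the hot path; the only synchronization in this sketch is the `sync.WaitGroup` before the merge.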
   
   
   ### Tests Performed
   
   * [x] Verified compatibility with all partition transforms
   * [x] Verified handling of null values in partition columns
   * [x] Validated compatibility with partition spec evolution
   * [x] Verified correctness for non-linear transformation cases
   * [x] Confirmed schema evolution compatibility
   * [x] Verified partition pruning
   
   ---
   
   @zeroshade — would appreciate your review when you get a chance!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

