ForeverAngry opened a new pull request, #2434:
URL: https://github.com/apache/iceberg-python/pull/2434

   # Rationale for this change
   
   This PR adds branch merge strategies to PyIceberg, bringing Git-like 
branch merging to Iceberg table operations. Users can choose the merge 
strategy that fits their workflow.
   
   **Feature Overview:**
   Apache Iceberg supports branch operations (create, delete, tag) but lacks 
merge capabilities between branches. This PR implements five merge 
strategies commonly used in version control systems:
   
   1. **MERGE**: Classic three-way merge creating a merge commit that preserves 
the history of both branches
   2. **SQUASH**: Condenses all commits from the source branch into a single 
commit on the target branch
   3. **REBASE**: Creates linear history by replaying commits from the source 
branch on top of the target branch
   4. **CHERRY_PICK**: Selects and applies specific individual commits from one 
branch to another
   5. **FAST_FORWARD**: Moves the target branch pointer forward when no divergent 
commits exist (no merge commit needed)
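
As a rough illustration of the FAST_FORWARD condition above, the check can be sketched on a plain parent-pointer map. This is not the PR's code: the `Parents` mapping and the helper names `ancestors` and `can_fast_forward` are hypothetical, and snapshot lineage is modeled as a simple dict rather than real table metadata.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for table metadata: snapshot id -> parent snapshot id.
Parents = Dict[int, Optional[int]]


def ancestors(parents: Parents, snapshot_id: int) -> List[int]:
    """Return snapshot ids from snapshot_id back to the root, newest first."""
    chain: List[int] = []
    seen = set()
    current: Optional[int] = snapshot_id
    while current is not None:
        if current in seen:
            # Guard against a corrupted lineage that loops back on itself.
            raise ValueError(f"circular parent reference at snapshot {current}")
        seen.add(current)
        chain.append(current)
        current = parents.get(current)
    return chain


def can_fast_forward(parents: Parents, source_head: int, target_head: int) -> bool:
    """True when the target head is an ancestor of (or equal to) the source head."""
    return target_head in ancestors(parents, source_head)
```

With `parents = {1: None, 2: 1, 3: 2, 4: 2}`, a target at snapshot 2 can be fast-forwarded to a source at snapshot 3, while a target at snapshot 4 cannot, because snapshots 3 and 4 have diverged from their shared parent.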
   
   **Implementation Details:**
   - **Strategy Pattern**: Clean, extensible architecture with abstract base 
class and concrete implementations
   - **Automatic Detection**: Fast-forward possibility automatically detected 
and validated
   - **Robust Utilities**: Common ancestor finding, branch validation, and 
snapshot traversal utilities
   - **Flexible API**: Optional source branch deletion after successful merge
   - **Error Handling**: Comprehensive validation with clear error messages for 
invalid operations
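
The strategy pattern described above might look roughly like the following. Every name here (`StrategyKind`, `MergeStrategySketch`, `strategy_for`, and the concrete classes) is hypothetical and chosen for illustration; the sketch operates on a generic in-memory commit graph, and how the PR maps these ideas onto Iceberg snapshot metadata is not shown.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, Tuple

# Hypothetical in-memory model: commit id -> tuple of parent ids,
# and branch name -> head commit id.
Graph = Dict[int, Tuple[int, ...]]
Heads = Dict[str, int]


class StrategyKind(Enum):
    MERGE = "merge"
    SQUASH = "squash"


class MergeStrategySketch(ABC):
    """Abstract base: each strategy decides how the target head advances."""

    @abstractmethod
    def merge(self, graph: Graph, heads: Heads, source: str, target: str) -> int:
        """Apply the merge and return the new head of the target branch."""


def _next_id(graph: Graph) -> int:
    # Toy id allocation for the sketch only.
    return max(graph) + 1


class MergeCommitSketch(MergeStrategySketch):
    def merge(self, graph: Graph, heads: Heads, source: str, target: str) -> int:
        new_id = _next_id(graph)
        # Two parents: the history of both branches stays reachable.
        graph[new_id] = (heads[target], heads[source])
        heads[target] = new_id
        return new_id


class SquashSketch(MergeStrategySketch):
    def merge(self, graph: Graph, heads: Heads, source: str, target: str) -> int:
        new_id = _next_id(graph)
        # One parent: source history is condensed into a single new commit.
        graph[new_id] = (heads[target],)
        heads[target] = new_id
        return new_id


_REGISTRY = {
    StrategyKind.MERGE: MergeCommitSketch,
    StrategyKind.SQUASH: SquashSketch,
}


def strategy_for(kind: StrategyKind) -> MergeStrategySketch:
    """Factory: look up and instantiate the strategy for the given kind."""
    return _REGISTRY[kind]()
```

Adding a strategy in this layout means writing one subclass and registering it; callers only ever see `strategy_for(kind).merge(...)`.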
   
   **Use Cases:**
   - **Development Workflows**: Feature branch integration with different merge 
policies
   - **Data Pipeline Management**: Merging experimental data processing 
branches back to production
   - **Schema Evolution**: Combining schema changes from different development 
branches
   - **Multi-tenant Environments**: Merging tenant-specific changes while 
maintaining isolation
   
   ## Are these changes tested?
   
   Yes. The PR adds **35 tests** across the following categories:
   
   **Core Functionality Tests:**
   - **Strategy Implementation**: All 5 merge strategies individually tested 
with various scenarios
   - **Utility Methods**: Common ancestor finding, fast-forward detection, 
branch validation
   - **Integration Tests**: End-to-end testing through ManageSnapshots API
   
   **Edge Cases & Error Handling:**
   - Missing snapshots and branches with proper error messages
   - Circular reference detection (prevents infinite loops)
   - Self-merge prevention and validation
   - Empty/invalid branch names
   - No common ancestor scenarios
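
For readers who want a concrete picture of these edge cases, here is a minimal sketch of request validation and common-ancestor lookup. The function names `validate_merge_request` and `find_common_ancestor`, the error messages, and the dict-based lineage model are all assumptions made for this example, not the PR's implementation.

```python
from typing import Dict, Optional

# Hypothetical models: snapshot id -> parent id, and branch name -> head id.
Parents = Dict[int, Optional[int]]
Heads = Dict[str, int]


def validate_merge_request(heads: Heads, source: str, target: str) -> None:
    """Reject empty names, self-merges, and unknown branches."""
    if not source or not target:
        raise ValueError("branch names must be non-empty")
    if source == target:
        raise ValueError(f"cannot merge branch '{source}' into itself")
    for name in (source, target):
        if name not in heads:
            raise ValueError(f"branch not found: {name}")


def find_common_ancestor(parents: Parents, a: int, b: int) -> Optional[int]:
    """Return the nearest snapshot reachable from both a and b, or None."""
    # Collect everything reachable from a; the membership check stops
    # the walk if the lineage loops back on itself.
    reachable_from_a = set()
    current: Optional[int] = a
    while current is not None and current not in reachable_from_a:
        reachable_from_a.add(current)
        current = parents.get(current)

    # Walk back from b until we hit something a can also reach.
    visited = set()
    current = b
    while current is not None and current not in visited:
        if current in reachable_from_a:
            return current
        visited.add(current)
        current = parents.get(current)
    return None
```

A `None` result corresponds to the "no common ancestor" case, which a caller could turn into an error before attempting any strategy that needs a merge base.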
   
   **Behavioral Validation:**
   - Fast-forward validation specific to FastForwardStrategy
   - Strategy selection and factory method testing
   - Source branch preservation vs. deletion options
   - Consistent return types across all strategies
   
   **Quality Assurance:**
   - All existing functionality preserved (no regressions)
   - Comprehensive type hints and documentation
   - All linting checks pass (ruff, mypy, pydocstyle)
   - Mock-based testing for isolation and reliability
   
   ## Are there any user-facing changes?
   
   **Yes - New Feature Addition (No Breaking Changes)**
   
   **New Public API:**
   ```python
   from pyiceberg.table.update.snapshot import BranchMergeStrategy
   
   # New enum with 5 merge strategies
   BranchMergeStrategy.MERGE
   BranchMergeStrategy.SQUASH
   BranchMergeStrategy.REBASE
   BranchMergeStrategy.CHERRY_PICK
   BranchMergeStrategy.FAST_FORWARD
   
   # New method on ManageSnapshots
   # (`table` is assumed to be an already-loaded PyIceberg table)
   table.manage_snapshots().merge_branch(
       source_branch="feature",
       target_branch="main",
       strategy=BranchMergeStrategy.SQUASH,
       delete_source_branch=False,  # Optional: preserve or delete source branch
   ).commit()
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
