GitHub user yjhjstz created a discussion: [Proposal] Enhanced ORCA Parallel
Planning to Align with PostgreSQL Planner
### Proposers
@yjhjstz
### Proposal Status
Under Discussion
### Abstract
ORCA (Pivotal Query Optimizer) currently has limited parallel planning
capabilities. This creates an inconsistency where:
- PostgreSQL planner can generate comprehensive parallel plans
- ORCA lacks equivalent parallel planning sophistication
- Users must disable ORCA (set optimizer=off) to fully utilize parallel
features
### Motivation
Extend ORCA to generate parallel execution plans that align with PostgreSQL's
parallel planning approach, while maintaining compatibility with Cloudberry's
MPP architecture.
### Implementation
1. Parallel-Aware Physical Operators
Extend ORCA's path generation to create parallel-aware operators:
- CPhysicalParallelSeqScan - Parallel sequential scans
- CPhysicalParallelIndexScan - Parallel index scans
- CPhysicalParallelBitmapHeapScan - Parallel bitmap heap scans
- CPhysicalParallelHash - Parallel hash operations
- CPhysicalParallelHashJoin - Parallel hash joins
- CPhysicalParallelAgg - Parallel aggregation
- CPhysicalParallelSort - Parallel sorting
2. Parallel-Aware Join Planning
Implement parallel join strategies similar to PostgreSQL:

    // Parallel hash join with shared hash table
    class CPhysicalParallelHashJoin : public CPhysicalHashJoin {
        // Enable parallel-aware hash table sharing
        // Handle worker coordination for hash table building
        // Manage locus for HashedWorkers distribution
    };

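To make the shared hash table idea concrete, here is a minimal, self-contained sketch of the runtime behaviour such an operator would describe: several workers build one hash table together, and probing starts only after every build worker has finished. `SharedHashTable`, `ParallelBuild`, and `ProbeCount` are hypothetical names used for illustration. They are not ORCA or Cloudberry executor APIs, and a mutex-guarded `std::unordered_multimap` is an assumption standing in for a real shared-memory hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::pair<int, int>;  // (join key, payload)

// Hypothetical stand-in for a hash table shared by all build workers.
class SharedHashTable {
public:
    // Called concurrently by build workers, so inserts take the lock.
    void Insert(const Row& row) {
        std::lock_guard<std::mutex> guard(mutex_);
        table_.emplace(row.first, row.second);
    }

    // Lock-free read; only safe once all build workers have been joined.
    std::size_t CountMatches(int key) const { return table_.count(key); }

private:
    std::mutex mutex_;
    std::unordered_multimap<int, int> table_;
};

// Each worker inserts a strided slice of the inner rows. Joining the
// threads acts as the barrier between the build and probe phases.
void ParallelBuild(const std::vector<Row>& inner, unsigned workers,
                   SharedHashTable& table) {
    assert(workers > 0);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&inner, &table, w, workers] {
            for (std::size_t i = w; i < inner.size(); i += workers) {
                table.Insert(inner[i]);
            }
        });
    }
    for (std::thread& t : pool) {
        t.join();
    }
}

// Probe phase: count join matches for a batch of outer keys.
std::size_t ProbeCount(const std::vector<int>& outer_keys,
                       const SharedHashTable& table) {
    std::size_t matches = 0;
    for (int key : outer_keys) {
        matches += table.CountMatches(key);
    }
    return matches;
}
```

The sketch covers only the build/probe coordination. Locus handling for a HashedWorkers distribution is planner-side and is not modelled here.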
3. Parallel Cost Model Integration
Enhance ORCA's cost model to account for parallel execution:
- CPU cost reduction based on parallel_workers
- Memory cost adjustments for shared resources
- I/O cost distribution across workers
- Startup cost penalties for worker coordination
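As a rough illustration of the CPU and startup adjustments listed above, the sketch below is modelled on how PostgreSQL's planner costs a parallel scan under a Gather node: CPU cost is divided by a "parallel divisor" that credits the leader with a shrinking share of the work, while `parallel_setup_cost` and `parallel_tuple_cost` are charged for worker startup and tuple transfer. The struct and function names are hypothetical and are not part of ORCA's cost model. As an assumption that follows PostgreSQL's sequential scan costing, I/O cost is left undivided, so distributing it across workers as the list proposes would be a further change.

```cpp
#include <cassert>

// Hypothetical cost knobs; defaults mirror PostgreSQL's GUC defaults.
struct ParallelCostParams {
    double parallel_setup_cost = 1000.0;  // one-time worker launch penalty
    double parallel_tuple_cost = 0.1;     // per-tuple worker-to-leader transfer
    bool leader_participates = true;      // parallel_leader_participation
};

// Effective number of processes sharing the CPU work. The leader is assumed
// to lose 30% of its capacity per worker to servicing worker output, so it
// contributes nothing once there are four or more workers.
double ParallelDivisor(int workers, bool leader_participates) {
    assert(workers >= 0);
    if (workers == 0) {
        return 1.0;  // serial plan: no division
    }
    double divisor = static_cast<double>(workers);
    if (leader_participates) {
        double leader_share = 1.0 - 0.3 * workers;
        if (leader_share > 0.0) {
            divisor += leader_share;
        }
    }
    return divisor;
}

// Total cost of a parallel scan plus its gather step.
double ParallelScanCost(double cpu_cost, double io_cost, double output_rows,
                        int workers, const ParallelCostParams& params) {
    double divisor = ParallelDivisor(workers, params.leader_participates);
    double scan_cost = cpu_cost / divisor + io_cost;  // I/O is not divided
    if (workers == 0) {
        return scan_cost;  // no gather overhead for a serial plan
    }
    return scan_cost + params.parallel_setup_cost +
           params.parallel_tuple_cost * output_rows;
}
```

Under these assumptions, a parallel plan only wins when the CPU savings outweigh the setup and transfer penalties, which is why small scans would stay serial.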
4. Parallel Motion Nodes
### Rollout/Adoption Plan
Benefits
1. Performance Consistency - Users get parallel execution regardless of
optimizer choice
2. Feature Parity - ORCA matches PostgreSQL planner capabilities
3. Enhanced Scalability - Better utilization of multi-core systems
4. Simplified Configuration - No need to disable ORCA for parallel workloads
5. Benchmark Gains - Expected speedups on TPC-DS and TPC-H workloads
### Are you willing to submit a PR?
- [X] Yes I am willing to submit a PR!
GitHub link: https://github.com/apache/cloudberry/discussions/1316