GitHub user jiaqizho edited a discussion: [Proposal] ORCA: remove the CJobQueue
### Proposers

jiaqizho

### Proposal Status

Under Discussion

### Abstract

About CJobQueue
----

**As I understand it, `CJobQueue` is used to prevent the same `CJob` from being executed more than once.**

The comment at CJob.h:L51 describes it:

```
// Job Queue:
// Each job maintains a job queue CJob::m_pjq of other identical jobs that
// are created while a given job is executing. For example, when exploring
// a group, a group exploration job J1 would be executing. Concurrently,
// another group exploration job J2 (for the same group) may be triggered
// by another worker. The job J2 would be added in a pending state to the
// job queue of J1. When J1 terminates, all jobs in its queue are notified
// to pick up J1 results.
```

Let me also explain the logic of `CJobQueue` in detail. Before we start, let me introduce the two kinds of `CJob` in a `CJobQueue`:

- `Queued CJob`: its position in the `CJobQueue` is >= 1
- `Main CJob`: its position in the `CJobQueue` is 0

If the `CJob` being executed is a `Queued CJob`:

- Only `CJob`s of the same type (Exploration/Implementation) in the same `CGroup` are added to the same `CJobQueue`.
- Whatever the result of `CJob->FExecute()` is, the `Queued CJob` always returns false in `CScheduler::FExecute`.
- After `CScheduler::FExecute`, the `Queued CJob` changes to the `EjrSuspended` state until the `Main CJob` releases it.
- Once the `Main CJob` calls `NotifyCompleted`, the `Queued CJob` is popped and resumes its parent. The parent can still advance its state machine, because the `Main CJob` has already done the same work in the `CGroup`.

If the `CJob` being executed is the `Main CJob`, it executes normally and, once its state machine has finished, calls the `NotifyCompleted` logic.

Git history
----

I looked through the git history of `CJobQueue`. It was added in the first version of ORCA (76feb99efdc92bf2a779d58c656d8894908223ab gpdb). In early versions of ORCA, concurrent execution existed until the commit "Remove multi-threading code (#510)" (61c7405ac737ce74804d57d8cd6c930219e8b124).
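The queueing rules described under "About CJobQueue" can be modelled in a few lines. This is only a minimal sketch, not ORCA's code: `Job`, `JobState`, `JobQueue`, `Add`, and `MaxQueueLengthSequential` are hypothetical names made up for illustration, and the real `CJob`/`CJobQueue`/`CScheduler` classes carry far more state. It also assumes, as this proposal reports, that in single-threaded mode no identical job is created while a main job is still in flight.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, simplified stand-in for a job's lifecycle (not ORCA's enum).
enum class JobState { Init, Running, Suspended, Completed };

// Hypothetical, simplified stand-in for CJob.
struct Job {
    int id;
    JobState state;
};

// Hypothetical stand-in for CJobQueue: position 0 is the main job,
// positions >= 1 are queued jobs waiting for the main job's result.
class JobQueue {
public:
    // Returns true if `job` became the main job and should do the work,
    // false if it was queued behind an existing main job and suspended.
    bool Add(Job* job) {
        jobs_.push_back(job);
        if (jobs_.size() == 1) {
            job->state = JobState::Running;
            return true;
        }
        job->state = JobState::Suspended;
        return false;
    }

    // Called by the main job when it finishes. Every queued job is popped
    // and returned so the caller can resume it; queued jobs do no work of
    // their own because the main job already produced the result.
    std::vector<Job*> NotifyCompleted(Job* job) {
        assert(!jobs_.empty() && jobs_.front() == job);
        job->state = JobState::Completed;
        jobs_.pop_front();
        std::vector<Job*> resumed(jobs_.begin(), jobs_.end());
        for (Job* queued : resumed) {
            queued->state = JobState::Completed;
        }
        jobs_.clear();
        return resumed;
    }

    std::size_t Size() const { return jobs_.size(); }

private:
    std::deque<Job*> jobs_;
};

// Runs identical jobs strictly one after another, the way a single-threaded
// scheduler would under the assumption above, and reports the largest queue
// length that was ever observed.
std::size_t MaxQueueLengthSequential(std::vector<Job>& jobs) {
    JobQueue queue;
    std::size_t max_len = 0;
    for (Job& job : jobs) {
        const bool is_main = queue.Add(&job);
        max_len = std::max(max_len, queue.Size());
        if (is_main) {
            queue.NotifyCompleted(&job);
        }
    }
    return max_len;
}
```

In this model a second job is only ever suspended when `Add` is called while another job is still at position 0, which needs two workers to overlap. When jobs run one at a time, every job becomes the main job and the queue never grows beyond one entry.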
Before that PR (#510), `CScheduler` could encounter a `Queued CJob`, because it used multiple threads to process the waiting list, and some jobs could be inserted into the same `CJobQueue`.

Proposal
----

In the current version of CBDB, **ORCA runs in single-threaded mode**. So the size of each `CJobQueue` is always 1, and `CScheduler` never gets a `Queued CJob`.

I have verified this point in PR https://github.com/apache/cloudberry/pull/742. Although the ORCA ICW tests are not enabled in GitHub CI, I have verified on my own machine that the ICW-orca test does not trigger these `assert(false)` calls.

There are two reasons to remove `CJobQueue`:

- `CJobQueue` is dead logic in the current version of ORCA.
- GP has shown that parallelization inside `CScheduler` itself is not a profitable endeavor. Maybe we should think about other ways to accelerate it.

Impact on cherry-pick GP
----

This code has not been changed in subsequent GP versions. Left is CBDB (46 search results), right is GPDB (46 search results).

<img width="698" alt="image" src="https://github.com/user-attachments/assets/ba75b6ee-9dfc-4d62-bf43-3639084465c1">

### Motivation

pass

### Implementation

pass

### Rollout/Adoption Plan

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

GitHub link: https://github.com/apache/cloudberry/discussions/743

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org