Hi everyone,

Thank you all for your valuable feedback and thoughtful discussion regarding my proposal. I appreciate the time and input from the community.

As the discussion has drawn to a close, I will start a formal [VOTE] thread soon to move this proposal forward.

Thanks again for your participation and support!

Best,
Xixi Chen

陈xx <[email protected]> wrote on Wed, Jul 23, 2025 at 21:34:

> Thanks for your valuable feedback. I agree with your suggestions.
>
> In practice, when there is only one pending table in the system, it is
> indeed reasonable to break the quota limit for that table to improve
> overall optimizing efficiency. However, since we cannot predict whether
> additional tables will enter the Pending state in the future, it is better
> to make it configurable whether the quota limit may be exceeded.
>
> The procedure for an optimizer thread to poll a task is as follows: each
> thread, following the scheduling order, selects the table with the highest
> scheduling priority from the tableQueue and polls optimizing tasks from the
> taskQueue of that table's optimizing process. If the selected table has
> reached its optimizing resource limit, the thread proceeds to the next
> pending table.
>
> Consider a scenario where the number of pending optimizing tasks of a
> table exceeds its quota, but idle optimizer resources are still
> available. In this case, some optimizer threads may traverse all pending
> tables and poll a null task, even though there are still pending
> optimizing tasks. This occurs because the tables to which these tasks
> belong have already reached their resource quota limits.
>
> To address this issue, when quota overcommitment is enabled, the system
> can allow idle optimizer threads to break the per-table quota limit and
> execute the optimizing task. This approach ensures that, in scenarios where
> there is only a single pending table or where certain tables have low quota
> settings, all optimizer resources can be fully utilized.
>
> Thank you for pointing this out. I recognize the importance of your
> suggestion and will make the appropriate changes.
>
> Best,
> Xixi Chen
>
> Qishang Zhong <[email protected]> wrote on Wed, Jul 23, 2025 at 17:08:
>
>> Hi.
>>
>> Thanks for starting this thread.
>>
>> I agree with Jinsong's point of view: a percentage might be more
>> appropriate.
>>
>> In one case, I have only one pending table, and I have generated tasks
>> that exceed the quota. Can I break the quota limit?
>>
>> Best,
>> Qishang Zhong
>>
>> 陈xx <[email protected]> wrote on Tue, Jul 22, 2025 at 19:14:
>>
>> > Thanks for your suggestion.
>> >
>> > Indeed, if we set the default quota for tables to a fixed value, a
>> > single table can still monopolize all optimizer resources when very
>> > few optimizer resources are available. Conversely, when optimizer
>> > resources are abundant, a fixed value may result in consistently low
>> > optimizing efficiency for a table that has multiple self-optimizing
>> > tasks, and in inefficient utilization of optimizer resources.
>> > Therefore, setting the quota as a percentage of all available
>> > optimizer resources is a more appropriate approach. I will make the
>> > corresponding modifications.
>> >
>> > Best,
>> > XixiChen
>> >
>> > Jinsong Zhou <[email protected]> wrote on Tue, Jul 22, 2025 at 18:01:
>> >
>> > > Hi,
>> > >
>> > > Thanks for bringing up this improvement.
>> > >
>> > > This improvement is indeed valuable. In my production practice, we
>> > > once encountered a situation where a single table suddenly consumed
>> > > all resources, causing all other tables to enter a pending status.
>> > > We need the capability to limit the maximum resources a single table
>> > > can use.
>> > >
>> > > However, I'd like to discuss how to set the default quota for tables.
>> > > I believe in most cases, users won't configure individual quotas for
>> > > each table, making default quotas particularly important.
>> > > Rather than using a fixed value, a percentage might be more
>> > > appropriate - for example, 50%, indicating a table can only consume
>> > > half of the entire group's resources.
>> > >
>> > > Best,
>> > > Jinsong
>> > >
>> > > On Tue, Jul 22, 2025 at 5:52 PM 陈xx <[email protected]> wrote:
>> > >
>> > > > Hi devs:
>> > > >
>> > > > We would like to start a discussion about AIP-1: Optimizing
>> > > > Allocation and Schedule Priority of Optimizer resources for
>> > > > Tables[1].
>> > > >
>> > > > An optimizer group comprises a collection of optimizers, where each
>> > > > optimizer instance typically contains multiple threads, with each
>> > > > optimizer thread responsible for executing a single optimizing
>> > > > task. When multiple self-optimizing tasks are pending and optimizer
>> > > > resources are limited, tasks originating from the same table may
>> > > > monopolize all available resources in the absence of proper
>> > > > constraints.
>> > > > So, we propose to optimize allocation and schedule priority of
>> > > > optimizer resources for tables.
>> > > > Looking forward to hearing from you.
>> > > >
>> > > > [1] https://cwiki.apache.org/confluence/x/bQ5JFg
>> > > >
>> > > > Best
>> > > > XixiChen
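
The polling procedure discussed in this thread (a per-table quota expressed as a percentage of the group's optimizer threads, plus optional overcommitment for idle threads) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual scheduler: the class names `PendingTable` and `QuotaAwareScheduler`, the fields `quotaRatio` and `allowOvercommit`, and the rounding rule (floor, with a minimum of one thread) are all hypothetical. Only the terms `tableQueue` and `taskQueue` come from the discussion itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of a pending table; not taken from the real code base.
final class PendingTable {
    final String name;
    final Queue<String> taskQueue = new ArrayDeque<>(); // pending optimizing tasks
    int runningTasks = 0;                                // threads currently used by this table

    PendingTable(String name, String... tasks) {
        this.name = name;
        for (String task : tasks) {
            taskQueue.add(task);
        }
    }
}

// Hypothetical scheduler: tables are kept in scheduling order, highest priority first.
final class QuotaAwareScheduler {
    private final List<PendingTable> tableQueue = new ArrayList<>();
    private final int totalThreads;        // all optimizer threads in the group
    private final double quotaRatio;       // assumed setting, e.g. 0.5 for 50%
    private final boolean allowOvercommit; // assumed switch for quota overcommitment

    QuotaAwareScheduler(int totalThreads, double quotaRatio, boolean allowOvercommit) {
        this.totalThreads = totalThreads;
        this.quotaRatio = quotaRatio;
        this.allowOvercommit = allowOvercommit;
    }

    void addTable(PendingTable table) {
        tableQueue.add(table);
    }

    // Per-table thread limit derived from the percentage (assumption: floor, at least 1).
    int quotaLimit() {
        return Math.max(1, (int) Math.floor(totalThreads * quotaRatio));
    }

    // Called by an idle optimizer thread; returns a task, or null if none can be polled.
    String pollTask() {
        int limit = quotaLimit();
        // First pass: walk tables in priority order and skip any table at its quota.
        for (PendingTable table : tableQueue) {
            if (!table.taskQueue.isEmpty() && table.runningTasks < limit) {
                table.runningTasks++;
                return table.taskQueue.poll();
            }
        }
        // Second pass: with overcommitment enabled, an otherwise idle thread may
        // take a task from a table that has already reached its quota.
        if (allowOvercommit) {
            for (PendingTable table : tableQueue) {
                if (!table.taskQueue.isEmpty()) {
                    table.runningTasks++;
                    return table.taskQueue.poll();
                }
            }
        }
        return null;
    }
}
```

With 4 threads and a ratio of 0.5, a single pending table holding three tasks hands out two tasks and then a null task when overcommitment is disabled, which is the idle-thread situation described above. With overcommitment enabled, the third poll returns the remaining task instead.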
