From: Glenn Miles <mil...@linux.ibm.com> The current xive algorithm for finding a matching group vCPU target always uses the first vCPU found. And, since it always starts the search with thread 0 of a core, thread 0 is almost always used to handle group interrupts. This can lead to additional interrupt latency and poor performance for interrupt intensive work loads.
Changing this to use a simple round-robin algorithm for deciding which thread number to use when starting a search, which leads to a more distributed use of threads for handling group interrupts. [npiggin: Also round-robin among threads, not just cores] Signed-off-by: Glenn Miles <mil...@linux.ibm.com> Reviewed-by: Nicholas Piggin <npig...@gmail.com> Reviewed-by: Glenn Miles <mil...@linux.ibm.com> Reviewed-by: Michael Kowal <ko...@linux.ibm.com> Reviewed-by: Caleb Schlossin <cal...@linux.ibm.com> Tested-by: Gautam Menghani <gau...@linux.ibm.com> Link: https://lore.kernel.org/qemu-devel/20250512031100.439842-9-npig...@gmail.com Signed-off-by: Cédric Le Goater <c...@redhat.com> --- hw/intc/pnv_xive2.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c index ec247ce48ff7..25dc8a372d2f 100644 --- a/hw/intc/pnv_xive2.c +++ b/hw/intc/pnv_xive2.c @@ -643,13 +643,18 @@ static int pnv_xive2_match_nvt(XivePresenter *xptr, uint8_t format, int i, j; bool gen1_tima_os = xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS; + static int next_start_core; + static int next_start_thread; + int start_core = next_start_core; + int start_thread = next_start_thread; for (i = 0; i < chip->nr_cores; i++) { - PnvCore *pc = chip->cores[i]; + PnvCore *pc = chip->cores[(i + start_core) % chip->nr_cores]; CPUCore *cc = CPU_CORE(pc); for (j = 0; j < cc->nr_threads; j++) { - PowerPCCPU *cpu = pc->threads[j]; + /* Start search for match with different thread each call */ + PowerPCCPU *cpu = pc->threads[(j + start_thread) % cc->nr_threads]; XiveTCTX *tctx; int ring; @@ -694,6 +699,15 @@ static int pnv_xive2_match_nvt(XivePresenter *xptr, uint8_t format, if (!match->tctx) { match->ring = ring; match->tctx = tctx; + + next_start_thread = j + start_thread + 1; + if (next_start_thread >= cc->nr_threads) { + next_start_thread = 0; + next_start_core = i + start_core + 1; + if (next_start_core >= chip->nr_cores) { + next_start_core = 0; + } + } } count++; } -- 2.50.1