Igor Mammedov <imamm...@redhat.com> writes: > On Thu, 23 Jan 2025 10:52:15 +0000 > Alex Bennée <alex.ben...@linaro.org> wrote: > >> Igor Mammedov <imamm...@redhat.com> writes: >> >> > QEMU will crash with following debug enabled >> > # define DEBUG_TLB_GATE 1 >> > # define DEBUG_TLB_LOG_GATE 1 >> > due to [1] introduced assert and as it >> > happenstlb_flush_by_mmuidx[_async_work] >> > functions are called not only from vcpu thread but also from reset handler >> > that is called from main thread at cpu realize time when vcpu is already >> > created >> > x86_cpu_new -> ... -> >> > x86_cpu_realizefn -> cpu_reset -> ... -> >> > tcg_cpu_reset_hold >> > >> > drop assert to fix crash. >> >> Hmm the assert is there for a good reason because we do not want to be >> flushing another CPUs state. However the assert itself: >> >> g_assert(!(cpu)->created || qemu_cpu_is_self(cpu)); >> >> was trying to account for pre-initialised vCPUs. What has changed? >> >> cpu_thread_signal_created(cpu) is called just before we start running >> the main loop in mttcg_cpu_thread_fn. So any other thread messing with >> the CPUs TLB can potentially mess things up. > > it reproduces on current master, so yes it likely has changed over time. > I've just stumbled on it when attempting to get rid of cpu->created > usage.
Why the drive to get rid of cpu->created? I guess we could assert: g_assert(!current_cpu || qemu_cpu_is_self(cpu); as current_cpu should only be set as we go into the main thread. However there is a sketchy setting of current_cpu in cpu_exec() that I'm not sure should be there. Also do_run_on_cpu() messes with current_cpu in a way I don't fully understand either. > > >> > 1) >> > Fixes: f0aff0f124028 ("cputlb: add assert_cpu_is_self checks") >> > Signed-off-by: Igor Mammedov <imamm...@redhat.com> >> > --- >> > accel/tcg/cputlb.c | 4 ---- >> > 1 file changed, 4 deletions(-) >> > >> > diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c >> > index b26c0e088f..2da803103c 100644 >> > --- a/accel/tcg/cputlb.c >> > +++ b/accel/tcg/cputlb.c >> > @@ -381,8 +381,6 @@ static void tlb_flush_by_mmuidx_async_work(CPUState >> > *cpu, run_on_cpu_data data) >> > uint16_t all_dirty, work, to_clean; >> > int64_t now = get_clock_realtime(); >> > >> > - assert_cpu_is_self(cpu); >> > - >> > tlb_debug("mmu_idx:0x%04" PRIx16 "\n", asked); >> > >> > qemu_spin_lock(&cpu->neg.tlb.c.lock); >> > @@ -419,8 +417,6 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t >> > idxmap) >> > { >> > tlb_debug("mmu_idx: 0x%" PRIx16 "\n", idxmap); >> > >> > - assert_cpu_is_self(cpu); >> > - >> > tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(idxmap)); >> > } >> -- Alex Bennée Virtualisation Tech Lead @ Linaro