David Gibson <da...@gibson.dropbear.id.au> writes: > On Thu, Feb 24, 2022 at 03:58:14PM -0300, Fabiano Rosas wrote: >> These two were not migrated so the remote end was starting with the >> decrementer expired. >> >> I am seeing less frequent crashes with this patch (tested with -smp 4 >> and -smp 32). It certainly doesn't fix all issues but it looks like it >> helps. >> >> Signed-off-by: Fabiano Rosas <faro...@linux.ibm.com> > > Nack. This completely breaks migration compatibility for all ppc > machines. In order to maintain compatibility as Richard says new info > has to go into a subsection, with a needed function that allows > migration of existing machine types both to and from a new qemu > version.
Ok, I'm still learning the tenets of migration. I'll give more thought to that in the next versions. > > There are also some other problems. > >> --- >> target/ppc/machine.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/target/ppc/machine.c b/target/ppc/machine.c >> index 1b63146ed1..7ee1984500 100644 >> --- a/target/ppc/machine.c >> +++ b/target/ppc/machine.c >> @@ -9,6 +9,7 @@ >> #include "qemu/main-loop.h" >> #include "kvm_ppc.h" >> #include "power8-pmu.h" >> +#include "hw/ppc/ppc.h" >> >> static void post_load_update_msr(CPUPPCState *env) >> { >> @@ -666,6 +667,18 @@ static const VMStateDescription vmstate_compat = { >> } >> }; >> >> +static const VMStateDescription vmstate_tb_env = { >> + .name = "cpu/env/tb_env", > > Because spapr requires that all cpus have synchronized timebases, we > migrate the timebase state from vmstate_spapr, not from each cpu > individually, to make sure it stays synchronized across migration. If > that's not working right, that's a bug, but it needs to be fixed > there, not just plastered over with extra information transmitted at > cpu level. Ok, so it that what guest_timebase is about? We shouldn't need to migrate DECR individually then. >> + .version_id = 1, >> + .minimum_version_id = 1, >> + .fields = (VMStateField[]) { >> + VMSTATE_INT64(tb_offset, ppc_tb_t), > > tb_offset isn't a good thing to put directly in the migration stream. > tb_offset has kind of non-obvious semantics to begin with: when we're > dealing with TCG (which is your explicit case), there is no host TB, > so what's this an offset from? That's why vmstate_ppc_timebase uses > an explicit guest timebase value matched with a time of day in real > units. Again, if there's a bug, that needs fixing there. This should be in patch 4, but tb_offset is needed for the nested case. When we enter the nested guest we increment tb_offset with nested_tb_offset and decrement it when leaving the nested guest. So tb_offset needs to carry this value over. But maybe we could alternatively modify the nested code to just zero tb_offset when leaving the guest instead? We could then remove nested_tb_offset altogether. >> + VMSTATE_UINT64(decr_next, ppc_tb_t), >> + VMSTATE_TIMER_PTR(decr_timer, ppc_tb_t), > > You're attempting to migrate decr_next and decr_timer, but not the > actual DECR value, which makes me suspect that *is* being migrated. > In which case you should be able to reconstruct the next tick and > timer state in a post_load function on receive. If that's buggy, it > needs to be fixed there. There isn't any "actual DECR" when running TCG, is there? My understanding is that it is embedded in the QEMUTimer. Mark mentioned this years ago: "I can't see anything in __cpu_ppc_store_decr() that updates the spr[SPR_DECR] value when the decrementer register is changed" https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01114.html Your answer was along the lines of: "we should be reconstructing the decrementer on the destination based on an offset from the timebase" https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01373.html So if I'm getting this right, in TCG we don't touch SPR_DECR because we only effectively care about what is in the decr_timer->expire_time. And in KVM we don't migrate DECR explicitly because we use the tb_offset and timebase_save/timebase_load to ensure the tb_offset in the destination has the correct value. But timebase_save/timebase_load are not used for TCG currently. So there would be nothing transfering the current decr value to the other side. >> + VMSTATE_END_OF_LIST() >> + } >> +}; >> + >> const VMStateDescription vmstate_ppc_cpu = { >> .name = "cpu", >> .version_id = 5, >> @@ -696,12 +709,16 @@ const VMStateDescription vmstate_ppc_cpu = { >> /* Backward compatible internal state */ >> VMSTATE_UINTTL(env.hflags_compat_nmsr, PowerPCCPU), >> >> + VMSTATE_STRUCT_POINTER_V(env.tb_env, PowerPCCPU, 1, >> + vmstate_tb_env, ppc_tb_t), >> + >> /* Sanity checking */ >> VMSTATE_UINTTL_TEST(mig_msr_mask, PowerPCCPU, >> cpu_pre_2_8_migration), >> VMSTATE_UINT64_TEST(mig_insns_flags, PowerPCCPU, >> cpu_pre_2_8_migration), >> VMSTATE_UINT64_TEST(mig_insns_flags2, PowerPCCPU, >> cpu_pre_2_8_migration), >> VMSTATE_UINT32_TEST(mig_nb_BATs, PowerPCCPU, cpu_pre_2_8_migration), >> + >> VMSTATE_END_OF_LIST() >> }, >> .subsections = (const VMStateDescription*[]) {