Hi all,
I'm still trying to get my KVM-dependent setup to work, and any help would be
very much appreciated.
TL;DR: after restoring from Arm KVM checkpoints, the simulation never advances.
I'd also like to accelerate gem5 simulation with Arm KVM but generate
checkpoints with AtomicSimpleCPU, so I could restore them on machines where Arm
KVM is not available (x86 servers), but this does not work either.
First, a brief follow-up on the previous help (getting the KVM boot to work): I
was comparing my work against the stable (and master) branches, but not against
develop, which has all the modifications Giacomo mentioned. I have now synced my
repo with develop, and KVM boot with 8 cores worked out of the box; it boots
much faster than any of the gem5 CPU models. So I discarded the modifications I
had made and am sticking with the develop branch to avoid introducing new
errors (even though my modifications were also working).
However, I'm struggling to leverage KVM for checkpointing, because the
simulation never advances when restoring from a KVM checkpoint.
When using fs.py with the --restore-with-cpu ArmV8KvmCPU --cpu-type ArmV8KvmCPU
flags, the checkpoint is restored, but I see no progress in
output_folder/system.terminal and gem5 never exits; the simulation appears to
be stuck. (This setup does not actually matter, because I need to restore from
a KVM checkpoint into a gem5 model, not from KVM to KVM, but I'm reporting this
test in case it is useful.)
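For reference, the restore invocation I use looks roughly like this (the binary
and checkpoint paths are placeholders for my setup, and "-r 1" assumes the
first checkpoint in the directory):

```shell
# Placeholder paths; the flags are the ones described above.
./build/ARM/gem5.opt configs/example/fs.py \
    --cpu-type ArmV8KvmCPU --restore-with-cpu ArmV8KvmCPU \
    --checkpoint-dir /path/to/checkpoints -r 1
```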
The same "stuck simulation" behavior occurs with --restore-with-cpu
ArmV8KvmCPU --cpu-type AtomicSimpleCPU. In this case I also enabled
--debug-flags=Exec and observed that the code gets stuck in the kernel's
"_raw_spin_lock_irqsave" function. (By stuck I mean more than 3 hours without
any new output from the debug flags.) I'm not sure what causes this.
Alternatively, I also tried switching CPUs from KVM to AtomicSimpleCPU right
before creating the checkpoint. Since I have successfully used AtomicSimpleCPU
to boot from gem5-generated/restored checkpoints in the past, I know
AtomicSimpleCPU checkpoints should work.
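The switch-then-checkpoint step itself follows the standard fs.py flow;
simplified from my script, it is roughly (switch_cpu_list pairs each old CPU
with its replacement, as fs.py builds it; cptdir is my output directory):

```
# Simplified sketch of the switch-then-checkpoint step.
m5.switchCpus(testsys, switch_cpu_list)            # KVM -> AtomicSimpleCPU
m5.checkpoint(os.path.join(cptdir, "cpt.%d" % m5.curTick()))
```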
In fact, this scenario would be best for me because, later on, I'd like to
restore my checkpoints on x86 servers, where ArmV8KvmCPU is not available and I
could never use --restore-with-cpu ArmV8KvmCPU.
But restoring from this checkpoint fails with:

    fatal: fatal condition !paramInImpl(cp, name, param) occurred: Can't unserialize 'system.cpu:_pid'

My guess for this case is that the AtomicSimpleCPU state was serialized under
system.switch_cpus (not system.cpu), which is not looked up when restoring the
checkpoint.
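If that guess is right, one possible (untested) workaround I considered is
rewriting the section names in the checkpoint's m5.cpt before restoring, so the
restore code finds the CPU state under system.cpu. A rough sketch, assuming
m5.cpt is the usual INI-style text file and the section names are as I
described:

```python
# Rough sketch (untested): rename "[system.switch_cpus...]" sections in a
# copy of m5.cpt to "[system.cpu...]" so the restore path finds the
# AtomicSimpleCPU state under the name it looks up.
def rename_sections(lines, old="system.switch_cpus", new="system.cpu"):
    out = []
    for line in lines:
        stripped = line.strip()
        # Section headers look like "[system.switch_cpus]" or
        # "[system.switch_cpus.itb]"; rename the matching prefix only.
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped[1:-1]
            if section == old or section.startswith(old + "."):
                line = "[" + new + section[len(old):] + "]"
        out.append(line)
    return out
```

This would be applied to a copy of m5.cpt before running the restore, leaving
the original checkpoint untouched.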
So my final attempt was to set AtomicSimpleCPU as the default CPU (testsys.cpu
in fs.py) and ArmV8KvmCPU as the switch CPU (testsys.switch_cpus). The idea was
to switch CPUs right at the start, run with KVM most of the time, and switch
back to atomic just to generate the checkpoints. That way, system.cpu would
hold the AtomicSimpleCPU state, so I would be able to restore on x86 servers
later.
However, gem5 segfaults when I assign "testsys.switch_cpus = switch_cpus",
after creating the switch_cpus list with KVM models:

    switch_cpus = [ArmV8KvmCPU(switched_out=True, cpu_id=i) for i in range(np)]
    for i in range(np):
        switch_cpus[i].system = testsys
        switch_cpus[i].workload = testsys.cpu[i].workload
        switch_cpus[i].clk_domain = testsys.cpu[i].clk_domain
        switch_cpus[i].isa = testsys.cpu[i].isa
    testsys.switch_cpus = switch_cpus  # this line causes a gem5 segfault
    switch_cpu_list = [(testsys.cpu[i], switch_cpus[i]) for i in range(np)]
I see that KVM is commonly used in various scripts in gem5-resources
(https://gem5.googlesource.com/public/gem5-resources/), but they all seem to
use KVM with x86. Is switching to x86 the best solution for my problem? Any
suggestions on the way I'm setting things up?
Again, thank you very much.
_______________________________________________
gem5-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]