On Fri, 5 Sep 2025, Julian Ganz wrote:
September 5, 2025 at 9:25 PM, "BALATON Zoltan" wrote:
On Fri, 5 Sep 2025, Julian Ganz wrote:
September 5, 2025 at 1:38 PM, "BALATON Zoltan" wrote:
Have you done any testing on how much overhead this adds
to interrupt-heavy guest workloads? At least for PPC these are already
much slower than on a real CPU, so I'd like it to get faster, not slower.
No, I have not made any performance measurements. However, given that
a similar hook is already called for every single TB execution, the
impact relative to the existing plugin infrastructure _should_ be
negligible.
That is, if your workload actually runs any code and is not constantly
bombarded with interrupts that _do_ result in a trap (which _may_ happen
during some tests).
So if you are performance-sensitive enough to care, you will very likely
want to disable plugins anyway.
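(For reference, the per-TB hook mentioned above is the kind of callback
the existing contrib plugins register; e.g. the hotblocks plugin counts
TB executions. Loading it looks roughly like this, assuming an in-tree
build and with the machine arguments elided:

  qemu-system-ppc <machine args> \
      -plugin ./contrib/plugins/libhotblocks.so -d plugin

where -d plugin routes the plugin's output to the log.)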
I can disable plugins and normally do, but that does not help those who get
QEMU from their distro (i.e. most users). If this infrastructure were disabled
in default builds and needed an explicit option to enable, then those who need
it could enable it, and it would not be imposed on everyone else who just gets
a default build from a distro and never uses plugins. An option that needs a
rebuild is like not having the option at all for most people. I guess the
question is which is the larger group: those who just run guests, or those who
use this instrumentation with plugins?
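(For concreteness: whether plugin support is compiled in at all is a
configure-time switch; the two variants would look roughly like this,
with the flag names as in configure --help and the target list only as
an example:

  ./configure --target-list=ppc-softmmu --disable-plugins   # no plugin support
  ./configure --target-list=ppc-softmmu --enable-plugins    # plugin support compiled in

so with a distro binary that choice has already been made for the user.)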
Hard to say.
The default should probably be what the larger group needs. Even then distros
may still change the default, so it would be best if the overhead could be
minimised even when enabled. I think the log infrastructure does that (when a
log category is off, qemu_log_mask() costs little more than testing a global
mask); would a similar solution work here?
For testing, I've found that because embedded PPC CPUs have a software-
controlled MMU (and in addition QEMU may flush TLB entries too often),
running something that does a lot of memory accesses, like the STREAM
benchmark on sam460ex, is hit by this IIRC; anything else causing a lot of
interrupts, like reading from an emulated disk or sound device, is probably
affected as well. I've tried to optimise PPC exception handling a bit before,
but whenever I optimise something it is later undone by other changes that
don't care about performance.
I could try running the benchmark on multiple versions:
* QEMU with plugins disabled,
* with plugins enabled but without these patches, and
* with plugins enabled and with these patches.
(Roughly as sketched below.) I'll likely only report back with results next
week, though.
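Something like the following, with the build directories and the guest
workload as placeholders:

  build-noplugins/qemu-system-ppc <guest args>   # configured with --disable-plugins
  build-master/qemu-system-ppc <guest args>      # plugins enabled, without this series
  build-series/qemu-system-ppc <guest args>      # plugins enabled, with this series applied

running the same guest workload in each case.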
Do you happen to have an image you can point me to? Either something
that has the benchmark already, or some Unix-like system running on the
platform?
I'm currently not motivated enough to cook up some bare-metal testbed
for a platform I'm not familiar with.
I don't have ready images to test embedded PPC MMU exceptions, which I
think may be most affected by this. I had an image for pegasos2 for a
general test, used here:
https://lists.nongnu.org/archive/html/qemu-discuss/2023-12/msg00008.html
but that machine has a G4 CPU with a hardware MMU, so it is likely not
affected.
I have uploaded some PPC binaries for the STREAM benchmark that I tested
with before here:
http://zero.eik.bme.hu/~balaton/qemu/stream-test.zip
which may exercise this if run on the sam460ex or ppce500 machines, but I
don't have a scripted test case for that. There is documentation on how to
run Linux on these machines here:
https://www.qemu.org/docs/master/system/target-ppc.html
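For a rough idea, a direct kernel boot would be along these lines; the
kernel, initrd and console arguments are placeholders to be filled in from
the docs above:

  qemu-system-ppc -M sam460ex -kernel vmlinux -initrd initrd.img \
      -append "console=ttyS0" -serial stdio

and similarly with -M ppce500.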
Alternatively, running a disk I/O benchmark on an emulated IDE controller in
PIO mode, or on some other device that generates a lot of interrupts, may
also test this. I think you can use the "info irq" command in the QEMU
monitor to check how many interrupts you get (see below).
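That is, with the monitor available, e.g. via -monitor stdio:

  (qemu) info irq

though whether the counters are reported depends on the interrupt controller
in use.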
Regards,
BALATON Zoltan