On Fri, 5 Sep 2025, Julian Ganz wrote:
September 5, 2025 at 9:25 PM, "BALATON Zoltan" wrote:
On Fri, 5 Sep 2025, Julian Ganz wrote:
September 5, 2025 at 1:38 PM, "BALATON Zoltan" wrote:
Have you done any testing on how much overhead this adds
 to interrupt-heavy guest workloads? At least for PPC these are already
 much slower than a real CPU, so I'd like them to get faster, not slower.

 No, I have not made any performance measurements. However, given that
 a similar hook is already called for every single TB execution, the
 impact relative to the existing plugin infrastructure _should_ be
 negligible.
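
 For illustration, the existing per-TB hook looks roughly like this
 from a plugin's point of view (a minimal sketch against the public
 TCG plugin API, not the code from these patches):

 #include <qemu-plugin.h>
 #include <stdint.h>

 QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

 static uint64_t tb_exec_count;

 /* Called on every execution of the translated block. */
 static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
 {
     tb_exec_count++; /* racy across vCPUs, fine for a sketch */
 }

 /* Called once per translation; registers the per-execution callback. */
 static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
 {
     qemu_plugin_register_vcpu_tb_exec_cb(tb, vcpu_tb_exec,
                                          QEMU_PLUGIN_CB_NO_REGS, NULL);
 }

 QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
                                            const qemu_info_t *info,
                                            int argc, char **argv)
 {
     qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
     return 0;
 }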

 That is, if your workload actually runs any code and is not constantly
 bombarded with interrupts that _do_ result in a trap (which _may_ happen
 during some tests).

 So if you are performance sensitive enough to care, you will very likely
 want to disable plugins anyway.

I can disable plugins and normally do, but that does not help those who get 
QEMU from their distro (i.e. most users). If this infrastructure were disabled 
in default builds and needed an explicit option to enable, then those who need 
it could enable it without imposing it on everyone else who just gets a default 
build from a distro and never uses plugins. An option that needs a rebuild is 
as good as no option for most people. I guess the question is which is the 
larger group: those who just run guests, or those who use this instrumentation 
with plugins?

Hard to say.

The default should probably be what the larger group needs. Even then distros 
may still change the default, so it would be best if the overhead could be 
minimised even when the feature is enabled. I think the log infrastructure 
does that; would a similar solution work here?
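
The pattern I have in mind is roughly what include/qemu/log.h does: the fast 
path is a single test of a global bitmask, so a disabled feature costs one 
well-predicted branch and no function call. A paraphrased sketch, not the 
exact QEMU code:

#include <stdio.h>

/* Paraphrased sketch of QEMU's qemu_log_mask() pattern, not the exact
 * code from include/qemu/log.h: when the mask bit is clear, the only
 * cost is testing a global flag. */
static int qemu_loglevel;                 /* bitmask of enabled classes */

#define LOG_GUEST_ERROR (1 << 0)          /* illustrative value */

#define qemu_log_mask(MASK, FMT, ...)                  \
    do {                                               \
        if (qemu_loglevel & (MASK)) {                  \
            fprintf(stderr, FMT, ## __VA_ARGS__);      \
        }                                              \
    } while (0)

int main(void)
{
    qemu_log_mask(LOG_GUEST_ERROR, "not printed, mask bit is clear\n");
    qemu_loglevel |= LOG_GUEST_ERROR;
    qemu_log_mask(LOG_GUEST_ERROR, "printed now\n");
    return 0;
}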

For testing I've found that, because embedded PPC CPUs have a software 
controlled MMU (and in addition to that QEMU may flush TLB entries too often), 
something that does a lot of memory accesses, like running the STREAM 
benchmark on sam460ex, is hit by this IIRC, but anything else causing a lot of 
interrupts, like reading from an emulated disk or sound device, is probably 
affected as well. I've tried to optimise PPC exception handling a bit before, 
but whenever I optimise something it is later undone by other changes that do 
not care about performance.

I could try running the benchmark on multiple versions:

* qemu with plugins disabled,
* with plugins enabled but without these patches and
* with plugins enabled and with these patches.

I'll likely only report back with results next week, though.
Do you happen to have an image you can point me to? Either something
that has the benchmark already or some unixoid running on the platform?
I'm currently not motivated enough to cook up some bare-metal testbed
for a platform I'm not familiar with.

I don't have ready images to test embedded PPC MMU exceptions, which I think may be affected most by this. I had an image for pegasos2 for a general test, used here:
https://lists.nongnu.org/archive/html/qemu-discuss/2023-12/msg00008.html
but that machine has a G4 CPU with a hardware MMU, so it is likely not affected.

I have uploaded some PPC binaries for the STREAM benchmark that I tested with before here:
http://zero.eik.bme.hu/~balaton/qemu/stream-test.zip
which may exercise this if run on the sam460ex or ppce500 machines, but I don't have a scripted test case for that. There is some documentation on how to run Linux on these machines here:
https://www.qemu.org/docs/master/system/target-ppc.html
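
For context, the STREAM kernels are just tight loops streaming through large arrays; a rough sketch of the "triad" kernel (not the actual benchmark source) shows why this stresses a software-managed MMU, since every TLB miss in such a loop traps to the guest kernel:

#include <stddef.h>

/* Rough sketch of the STREAM "triad" kernel, not the actual benchmark
 * source. Each iteration touches three large arrays, so with a
 * software-managed TLB (as on the 440 core of sam460ex) TLB misses
 * raise exceptions that the guest kernel has to handle. */
#define STREAM_ARRAY_SIZE 10000000

static double a[STREAM_ARRAY_SIZE];
static double b[STREAM_ARRAY_SIZE];
static double c[STREAM_ARRAY_SIZE];

void triad(double scalar)
{
    for (size_t i = 0; i < STREAM_ARRAY_SIZE; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}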

Alternatively, running a disk I/O benchmark on an emulated IDE controller in PIO mode, or on some other device that generates a lot of interrupts, may also test this. I think you can use the "info irq" command in the QEMU monitor to check how many interrupts you get.

Regards,
BALATON Zoltan
