Hi Richard,

Well I spun up some of the ideas we talked about to see if there was anything to be squeezed out of the function. In the end the results seem to be a washout with my pigz benchmark:

  qemu-system-aarch64 -cpu cortex-a57 \
    -machine type=virt,virtualization=on,gic-version=3 \
    -serial mon:stdio \
    -netdev user,id=unet,hostfwd=tcp::2222-:22 \
    -device virtio-net-pci,netdev=unet,id=virt-net,disable-legacy=on \
    -device virtio-scsi-pci,id=virt-scsi,disable-legacy=on \
    -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-buster-arm64 \
    -device scsi-hd,drive=hd,id=virt-scsi-hd \
    -smp 4 -m 4096 \
    -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image \
    -append "root=/dev/sda2 systemd.unit=benchmark-pigz.service" \
    -display none -snapshot

| Command | Mean [s]       | Min [s] | Max [s] | Relative |
|---------+----------------+---------+---------+----------|
| Before  | 46.597 ± 2.482 | 45.208  | 53.618  | 1.00     |
| After   | 46.867 ± 2.242 | 45.871  | 53.180  | 1.00     |

Maybe the code cleanup itself makes it worthwhile. WDYT?

Alex Bennée (5):
  accel/tcg: rename tb_lookup__cpu_state and hoist state extraction
  accel/tcg: move CF_CLUSTER calculation to curr_cflags
  accel/tcg: drop the use of CF_HASH_MASK and rename params
  include/exec: lightly re-arrange TranslationBlock
  include/exec/tb-lookup: try and reduce branch prediction issues

 include/exec/exec-all.h   | 20 +++++++++++---------
 include/exec/tb-lookup.h  | 34 +++++++++++++++++-----------------
 accel/tcg/cpu-exec.c      | 31 ++++++++++++++++++-------------
 accel/tcg/tcg-runtime.c   |  6 ++++--
 accel/tcg/translate-all.c | 14 ++++++++------
 softmmu/physmem.c         |  2 +-
 6 files changed, 59 insertions(+), 48 deletions(-)

-- 
2.20.1
