There are a number of problems with the current interpreter:

(1) It uses direct unaligned accesses as part of the bytecode stream.
    This fails with SIGBUS on strict-alignment hosts. This could of
    course be fixed to use "proper" unaligned accesses, but that would
    just be slow.

(2) The method by which it implements calls is non-portable, and
    really only works for x86.

(3) The code is full of ifdefs and TODOs that never got completed.

(4) The "registers" are in a global "tci_reg" array, which implies
    that if threads are ever used, we'll immediately get corruption.
    Fixing this is complicated by the structure of the interpreter.

(5) It hasn't been updated to the "new" ldst opcodes.

To me, all of this adds up to a complete rewrite. While it might just
be possible to stage such a thing in, I really don't see a way to do
that cleanly. Nor am I certain it's worth the effort.

This rewrite fixes all five problems in one go:

(1) The bytecodes are encoded into a stream of uint32_t, so there are
    never any unaligned accesses.

(2) Use libffi. This requires a few adjustments to the generic call
    code, so that the parameters get laid out on the "stack" as we
    wish.

(3) Minimal ifdefage. My goal was to make sure that nearly all code
    paths are compiled, and then removed by an optimizing compiler if
    they're not reachable.

(4) Don't do that.

(5) Do that.

The result works on the 4 hosts that I could try: x86, arm, sparc,
ppc64. The speed of emulation appears to be about the same; any
measured difference seems to be within noise. The size of both the
interpreter and the bytecodes is greatly reduced.
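
As a rough illustration of points (1) and (4) of the rewrite, here is
a minimal sketch of a bytecode stream made of whole uint32_t words,
with the registers passed in by the caller instead of living in a
global array. This is not the code from the series: the field layout,
the opcode numbers and every ex_* name are assumptions invented for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical field layout, for illustration only:
 *   bits  0..7   opcode
 *   bits  8..11  destination register
 *   bits 12..15  first source register
 *   bits 16..19  second source register
 */
enum { EX_OP_ADD = 1, EX_OP_SUB = 2 };

/* Pack one three-register instruction into a single 32-bit word. */
static uint32_t ex_encode_rrr(unsigned op, unsigned rd,
                              unsigned rs1, unsigned rs2)
{
    assert(op < 256 && rd < 16 && rs1 < 16 && rs2 < 16);
    return (uint32_t)op | ((uint32_t)rd << 8)
         | ((uint32_t)rs1 << 12) | ((uint32_t)rs2 << 16);
}

/* Extract LEN bits starting at bit POS (LEN must be less than 32). */
static unsigned ex_field(uint32_t insn, unsigned pos, unsigned len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/*
 * Interpret N words of bytecode.  Every fetch is a naturally aligned
 * uint32_t load, and the register file belongs to the caller, so two
 * threads can each run with their own array.
 */
static void ex_run(const uint32_t *code, size_t n, uint64_t *regs)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t insn = code[i];
        unsigned rd = ex_field(insn, 8, 4);
        uint64_t a = regs[ex_field(insn, 12, 4)];
        uint64_t b = regs[ex_field(insn, 16, 4)];

        switch (ex_field(insn, 0, 8)) {
        case EX_OP_ADD:
            regs[rd] = a + b;
            break;
        case EX_OP_SUB:
            regs[rd] = a - b;
            break;
        default:
            assert(!"unknown opcode");
        }
    }
}
```

Operands that do not fit in the fields of one word would need one or
more extra words in such a scheme; the sketch leaves that out.
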

   text    data     bss     dec     hex filename
   7563       0     160    7723    1e2b bld-old/ppc64-softmmu/tci.o
   4313       0       8    4321    10e1 bld-new/ppc64-softmmu/tci.o

Alpha rom boot
-gen code size       64208/33430528
+gen code size       36656/33430528

PPC rom boot
-gen code size       188144/33430528
+gen code size       99776/33430528

Sparc rom boot
-gen code size       432784/33430528
+gen code size       271104/33430528

Since the patch is essentially illegible, please have a look at

  git://github.com/rth7680/qemu.git tci-6


r~


Richard Henderson (3):
  ppc: Disable cacheutils for the interpreter
  tci: Build ffi data structures for helpers
  tci: Rewrite from scratch

 configure                    |   12 +
 disas/tci.c                  |  415 +++++++++-
 include/exec/cpu-all.h       |    6 +-
 include/exec/exec-all.h      |    5 +-
 include/exec/helper-ffi.h    |   83 ++
 include/exec/helper-tcg.h    |   18 +-
 include/qemu/cache-utils.h   |    2 +-
 include/qemu/tci.h           |  171 +++++
 target-i386/ops_sse_header.h |    6 +
 target-ppc/helper.h          |    1 +
 tcg/tcg.c                    |   59 +-
 tcg/tci/README               |  113 +--
 tcg/tci/tcg-target.c         | 1303 ++++++++++++++-------------------
 tcg/tci/tcg-target.h         |  156 ++--
 tci.c                        | 1752 ++++++++++++++------------------------------
 translate-all.c              |    5 +-
 util/cache-utils.c           |    2 +-
 17 files changed, 1976 insertions(+), 2133 deletions(-)
 create mode 100644 include/exec/helper-ffi.h
 create mode 100644 include/qemu/tci.h

-- 
1.9.0