On Tue, 8 Jan 2019 at 16:30, Peter Maydell <[email protected]> wrote:
>
> Include the cluster number in the hash we use to look
> up TBs. This is important because a TB that is valid
> for one cluster at a given physical address and set
> of CPU flags is not necessarily valid for another:
> the two clusters may have different views of physical
> memory, or may have different CPU features (eg FPU
> present or absent).
>
> We put the cluster number in the high 8 bits of the
> TB cflags. This gives us up to 256 clusters, which should
> be enough for anybody. If we ever need more, or need
> more bits in cflags for other purposes, we could make
> tb_hash_func() take more data (and expand qemu_xxhash7()
> to qemu_xxhash8()).
>
> Signed-off-by: Peter Maydell <[email protected]>
> ---
>  include/exec/exec-all.h   | 4 +++-
>  accel/tcg/cpu-exec.c      | 4 ++++
>  accel/tcg/translate-all.c | 3 +++
>  3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 815e5b1e838..aa7b81aaf01 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -351,9 +351,11 @@ struct TranslationBlock {
>  #define CF_USE_ICOUNT 0x00020000
>  #define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */
>  #define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */
> +#define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
> +#define CF_CLUSTER_SHIFT 24
>  /* cflags' mask for hashing/comparison */
>  #define CF_HASH_MASK \
> -    (CF_COUNT_MASK | CF_LAST_IO | CF_USE_ICOUNT | CF_PARALLEL)
> +    (CF_COUNT_MASK | CF_LAST_IO | CF_USE_ICOUNT | CF_PARALLEL |
> CF_CLUSTER_MASK)
>
>  /* Per-vCPU dynamic tracing state used to generate this TB */
>  uint32_t trace_vcpu_dstate;
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 870027d4359..e578a1a3aee 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -336,6 +336,10 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu,
> target_ulong pc,
>          return NULL;
>      }
>      desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
> +
> +    cf_mask &= ~CF_CLUSTER_MASK;
> +    cf_mask |= cpu->cluster_index << CF_CLUSTER_SHIFT;
> +

This hunk turns out not to be quite right -- it needs to move to
the top of the function, before the assignment "desc.flags = flags;".
Otherwise tb_lookup_cmp() will spuriously fail, and execution
becomes somewhat slower because we have to keep retranslating
TBs rather than reusing them. (Surprisingly this is only noticeable
in an ARM TFM image I happen to have, not in Linux kernel boot...)

thanks
-- PMM
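
To illustrate the ordering problem described above, here is a minimal toy model in C. It is a sketch under assumptions, not the QEMU code: toy_desc, toy_tb, merge_cluster(), toy_lookup_cmp() and the two lookup_* functions are hypothetical names, and it assumes that the lookup descriptor takes its own copy of cf_mask near the "desc.flags = flags;" assignment and that the comparison callback checks the TB's cflags against that copy. Only CF_CLUSTER_MASK and CF_CLUSTER_SHIFT come from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the quoted patch. */
#define CF_CLUSTER_MASK  0xff000000u
#define CF_CLUSTER_SHIFT 24

/* Hypothetical stand-ins for the lookup descriptor and a cached TB. */
struct toy_desc { uint32_t cf_mask; };
struct toy_tb   { uint32_t cflags; };

/* Replace the cluster bits of cf_mask with the given cluster index. */
static uint32_t merge_cluster(uint32_t cf_mask, uint32_t cluster_index)
{
    assert(cluster_index <= 0xff);  /* only 8 bits are available */
    cf_mask &= ~CF_CLUSTER_MASK;
    cf_mask |= cluster_index << CF_CLUSTER_SHIFT;
    return cf_mask;
}

/* Assumed comparison: a TB matches only if its cflags equal the
 * descriptor's copy of cf_mask. */
static bool toy_lookup_cmp(const struct toy_tb *tb,
                           const struct toy_desc *desc)
{
    return tb->cflags == desc->cf_mask;
}

/* Ordering as in the posted hunk: the descriptor is filled in first,
 * so it never sees the cluster bits. */
static bool lookup_late_merge(const struct toy_tb *tb, uint32_t cf_mask,
                              uint32_t cluster_index)
{
    struct toy_desc desc = { .cf_mask = cf_mask };
    cf_mask = merge_cluster(cf_mask, cluster_index);
    (void)cf_mask;  /* in this toy, the merged value would only feed the hash */
    return toy_lookup_cmp(tb, &desc);
}

/* Corrected ordering: merge the cluster bits before filling in the
 * descriptor, so hash and comparison agree. */
static bool lookup_early_merge(const struct toy_tb *tb, uint32_t cf_mask,
                               uint32_t cluster_index)
{
    cf_mask = merge_cluster(cf_mask, cluster_index);
    struct toy_desc desc = { .cf_mask = cf_mask };
    return toy_lookup_cmp(tb, &desc);
}
```

In this toy, a TB translated for cluster 1 is never matched by lookup_late_merge() but is matched by lookup_early_merge(). A TB for cluster 0 matches either way, because its cluster bits are all zero.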
