On 26/06/2015 16:47, [email protected] wrote:
> From: KONRAD Frederic <[email protected]>
>
> Instead of doing the jump cache invalidation directly in tb_invalidate,
> delay it until after the exit so we don't have another CPU trying to
> execute the code being invalidated.
>
> Signed-off-by: KONRAD Frederic <[email protected]>
> ---
> translate-all.c | 61
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 59 insertions(+), 2 deletions(-)
>
> diff --git a/translate-all.c b/translate-all.c
> index ade2269..468648d 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -61,6 +61,7 @@
> #include "translate-all.h"
> #include "qemu/bitmap.h"
> #include "qemu/timer.h"
> +#include "sysemu/cpus.h"
>
> //#define DEBUG_TB_INVALIDATE
> //#define DEBUG_FLUSH
> @@ -966,14 +967,58 @@ static inline void tb_reset_jump(TranslationBlock *tb,
> int n)
> tb_set_jmp_target(tb, n, (uintptr_t)(tb->tc_ptr +
> tb->tb_next_offset[n]));
> }
>
> +struct CPUDiscardTBParams {
> + CPUState *cpu;
> + TranslationBlock *tb;
> +};
> +
> +static void cpu_discard_tb_from_jmp_cache(void *opaque)
> +{
> + unsigned int h;
> + struct CPUDiscardTBParams *params = opaque;
> +
> + h = tb_jmp_cache_hash_func(params->tb->pc);
> + if (params->cpu->tb_jmp_cache[h] == params->tb) {
> + params->cpu->tb_jmp_cache[h] = NULL;
> + }
It is a bit more tricky, but I think you can avoid async_run_on_cpu by
doing this:
1) introduce a QemuSeqLock in TBContext, e.g. invalidate_seqlock.
2) wrap this "if" with seqlock_write_lock/unlock
3) in cpu-exec.c do this:
/* we add the TB in the virtual pc hash table */
+ idx = seqlock_read_begin(&tcg_ctx.tb_ctx.invalidate_seqlock);
cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;
+ if (seqlock_read_retry(&tcg_ctx.tb_ctx.invalidate_seqlock, idx)) {
+ /* Another CPU invalidated a tb in the meanwhile. We do not
+ * know if it's this one, but play it safe and avoid caching
+ * it.
+ */
+ cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = NULL;
+ }
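
To make the idea concrete, here is a minimal standalone sketch of the
pattern, not QEMU code. Every Sketch*/sketch_* name is hypothetical, the
seqlock is a simplified stand-in for QemuSeqLock built on C11 atomics, a
single shared array stands in for the per-CPU tb_jmp_cache, and writers
are assumed to be serialized by an outer lock (tb_lock in the patch).
The "racing" hook only simulates an invalidation landing between the
store and the retry check; the cache slots themselves are plain pointers,
so this shows the control flow rather than a full multi-threaded
implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QemuSeqLock: an odd value means "writer active". */
typedef struct {
    atomic_uint sequence;
} SketchSeqLock;

/* Writers are assumed to be serialized by an outer lock. */
static void sketch_write_lock(SketchSeqLock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);     /* sequence becomes odd */
}

static void sketch_write_unlock(SketchSeqLock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);     /* sequence becomes even again */
}

static unsigned sketch_read_begin(SketchSeqLock *sl)
{
    /* Clear the low bit so a read that starts mid-write always retries. */
    return atomic_load(&sl->sequence) & ~1u;
}

static bool sketch_read_retry(SketchSeqLock *sl, unsigned start)
{
    return atomic_load(&sl->sequence) != start;
}

#define SKETCH_CACHE_SIZE 16

typedef struct {
    unsigned long pc;
} SketchTB;

static SketchSeqLock invalidate_seqlock;
static SketchTB *jmp_cache[SKETCH_CACHE_SIZE];

static unsigned sketch_hash(unsigned long pc)
{
    return (unsigned)(pc % SKETCH_CACHE_SIZE);
}

/* Invalidation side: the "if" is wrapped in the write lock (step 2). */
static void sketch_discard(SketchTB *tb)
{
    unsigned h = sketch_hash(tb->pc);

    sketch_write_lock(&invalidate_seqlock);
    if (jmp_cache[h] == tb) {
        jmp_cache[h] = NULL;
    }
    sketch_write_unlock(&invalidate_seqlock);
}

/* Execution side (step 3): store the TB, then drop it again if any
 * invalidation ran in the meantime.  "racing" is a test-only hook. */
static void sketch_cache_tb(SketchTB *tb, void (*racing)(SketchTB *))
{
    unsigned h = sketch_hash(tb->pc);
    unsigned idx = sketch_read_begin(&invalidate_seqlock);

    jmp_cache[h] = tb;
    if (racing) {
        racing(tb);
    }
    if (sketch_read_retry(&invalidate_seqlock, idx)) {
        jmp_cache[h] = NULL;    /* play it safe: do not cache */
    }
}
```

Without a concurrent invalidation the entry stays cached; if the sequence
moved, the entry is dropped even when a different TB was invalidated,
which costs a lookup but never leaves a stale pointer behind.
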
> + /* suppress this TB from the two jump lists */
> + tb_jmp_remove(tb, 0);
> + tb_jmp_remove(tb, 1);
If you do the above synchronously, this part doesn't need to be deferred
either.
Then, immediately after the two tb_jmp_remove calls you can also check
whether "(tb->jmp_first & 3) == 2": if so, the expensive
async_run_safe_work_on_cpu can be skipped.
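
A minimal sketch of that check, under the assumption that the low two
bits of jmp_first are a tag and that the value 2 marks the list head
pointing back at the TB itself, i.e. no other TB jumps into this one.
SketchTB and the sketch_* helpers are hypothetical and greatly
simplified; in particular the per-slot "next" links of the real list are
omitted.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct SketchTB {
    /* Tagged pointer: low two bits are a tag, the rest is a TB address. */
    struct SketchTB *jmp_first;
} SketchTB;

/* Empty incoming-jump list: jmp_first points at the TB itself, tag 2. */
static void sketch_jmp_list_init(SketchTB *tb)
{
    tb->jmp_first = (SketchTB *)((uintptr_t)tb | 2);
}

/* Record that jump slot n (0 or 1) of "from" targets tb. */
static void sketch_add_incoming(SketchTB *tb, SketchTB *from, unsigned n)
{
    tb->jmp_first = (SketchTB *)((uintptr_t)from | n);
}

/* The cast to uintptr_t is needed before masking a pointer value. */
static bool sketch_jmp_list_empty(const SketchTB *tb)
{
    return ((uintptr_t)tb->jmp_first & 3) == 2;
}
```

When sketch_jmp_list_empty() is true, no other TB has been patched to
jump here, so there is nothing another CPU could be chained into.
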
Paolo
> +#endif /* MTTCG */
>
> tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
> tb_unlock();
>