On Mon, Nov 26, 2018 at 15:16:00 +0000, Alex Bennée wrote:
> Emilio G. Cota <[email protected]> writes:
(snip)
> > + if (tb_trans_cb && first_pass) {
> > + qemu_plugin_tb_trans_cb(cpu, plugin_tb);
> > + first_pass = false;
> > + goto translate;
> > + }
>
> So the only reason we are doing this two pass tango is to ensure the
> plugin can insert TCG ops before the actual translation has occurred?
Not only. The idea is to provide plugins with well-defined TBs,
i.e. the instruction sizes and contents can be queried by the plugin
before the plugin decides how/where to instrument the TB.
Since in the targets we generate TCG code and also generate
host code in a single shot (TranslatorOps.translate_insn),
the 2-pass approach is a workaround to first get the
well-defined TB, and in the second pass inject the instrumentation
in the appropriate places.
This is a bit of a waste but given that it only happens at
translate time, it can have negligible performance impact --
I measured a 0.13% gmean slowdown for SPEC06int.
> I think we can do better, especially as the internal structures of
> TCGops are implemented as a list so ops and be inserted before and after
> other ops. This is currently only done by the optimiser at the moment,
> see:
>
> TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *op, TCGOpcode opc, int
> narg);
> TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *op, TCGOpcode opc, int
> narg);
>
> and all the base tcg ops end up going to tcg_emit_op which just appends
> to the tail. But if we can come up with a neater way to track the op
> used before the current translated expression we could do away with two
> phases translation completely.
This list of ops is generated via TranslatorOps.translate_insn.
Unfortunately, this function also defines the guest insns that form the TB.
Decoupling the two actions ("define the TB" and "translate to TCG ops")
would be ideal, but I didn't want to rewrite all the target translators
in QEMU, and opted instead for the 2-pass approach as a compromise.
Thanks,
Emilio