On Mon, 1 Aug 2011, Paolo Bonzini wrote:
On 08/01/2011 05:57 PM, Dimitrios Apostolou wrote:
I don't fully understand the output from -fdump-tree-all, but my
conclusion, based also on profiler output and objdump, is that both
unrolling and inlining are happening in both versions. Nevertheless, I can
see that the assembly output is a bit different in the two cases (I can
post specific disassembly output if you are interested).
Thanks for checking.
Have you tried the idea of passing an unsigned HOST_WIDEST_FAST_INT * (or
whatever the name is) to the target hook?
Keeping my patch exactly the same, and only changing
hook_void_hard_reg_set to receive a (HOST_WIDEST_FAST_INT *) arg with the
necessary typecasts, added an extra 3 M instructions.
But ix86_live_on_entry is only called 1233 times from df-scan.c. That
isn't enough to explain all of this overhead.
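
For readers following along, here is a rough, standalone sketch of the two
calling conventions being compared. It is not code from the patch: word_t,
reg_set_t, NUM_REGS, register number 6 and all function names below are
hypothetical stand-ins, and it only shows that the two hook shapes fill the
set identically. It makes no attempt to reproduce the instruction counts.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical stand-ins, NOT GCC's real definitions.  word_t plays the
   role of HOST_WIDEST_FAST_INT, reg_set_t the role of a multi-word
   hard register set.  */
typedef unsigned long word_t;
#define WORD_BITS (sizeof (word_t) * CHAR_BIT)
#define NUM_REGS 80   /* assumed register count, for illustration only */
#define SET_WORDS ((NUM_REGS + WORD_BITS - 1) / WORD_BITS)

typedef struct { word_t elts[SET_WORDS]; } reg_set_t;

/* Variant A: the hook receives a pointer to the set type itself.  */
static void
live_on_entry_set (reg_set_t *regs)
{
  /* Mark (hypothetical) register 6 as live on entry.  */
  regs->elts[6 / WORD_BITS] |= (word_t) 1 << (6 % WORD_BITS);
}

/* Variant B: the hook receives a bare pointer to the underlying words.  */
static void
live_on_entry_words (word_t *words)
{
  words[6 / WORD_BITS] |= (word_t) 1 << (6 % WORD_BITS);
}

/* Return 1 if both variants leave the set in the same state.  */
int
variants_agree (void)
{
  reg_set_t a, b;
  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  live_on_entry_set (&a);
  live_on_entry_words (b.elts);   /* caller hands over the raw words */
  return memcmp (&a, &b, sizeof a) == 0;
}
```

Both variants do the same work per call, which is consistent with the
observation that the call count alone should not account for the
difference.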
Dimitris