On Wed, Jun 27, 2012 at 5:02 AM, Richard Henderson <r...@redhat.com> wrote:
> The problem I'd like to solve is stuff like
>
>         pxor    %xmm4, %xmm4
>         ...
>         movdqa  %xmm4, %xmm2
>         pcmpgtd %xmm0, %xmm2
>
> In that there's no point performing the copy from xmm4
> rather than just emitting a new pxor insn.
>
> The Real Problem, as I see it, is that at the point (g)cse
> runs we have no visibility into the 2-operand matching
> constraint on that pcmpgtd so we make the wrong choice
> in sharing the zero.
>
> If we're using AVX, instead of SSE, we don't use matching
> constraints and given the 3-operand insn, hoisting the zero
> is the right and proper thing to do because we won't need
> to emit that movdqa.
>
> Of course, this fires for normal integer code as well.
> Some cases it's a clear win:
>
> -:   41 be 1f 00 00 00       mov    $0x1f,%r14d
> ...
> -:   4c 89 f1                mov    %r14,%rcx
> +:   b9 1f 00 00 00          mov    $0x1f,%ecx
>
> sometimes not (increased code size):
>
> -:   41 bd 01 00 00 00       mov    $0x1,%r13d
> -:   4d 89 ec                mov    %r13,%r12
> +:   41 bc 01 00 00 00       mov    $0x1,%r12d
> +:   41 bd 01 00 00 00       mov    $0x1,%r13d
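For concreteness, the vector pattern quoted above can be reproduced from C with SSE2 intrinsics. This is my own sketch, not code from the thread, and the function name is hypothetical:

```c
#include <emmintrin.h>

/* Hypothetical reproducer: lanes of x that are negative become
   all-ones.  Under SSE2 the generated pcmpgtd is destructive
   (2-operand), so the zero input must either be copied from a
   shared register (movdqa) or re-materialized (pxor); with AVX's
   3-operand vpcmpgtd neither is needed.  */
__m128i
negative_mask (__m128i x)
{
  __m128i zero = _mm_setzero_si128 ();  /* pxor  %xmmN, %xmmN   */
  return _mm_cmpgt_epi32 (zero, x);     /* 0 > x, i.e. x < 0    */
}
```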
I suppose that might be fixed if, instead of

+  /* Only use the constant when it's just as cheap as a reg move.  */
+  if (set_src_cost (c, optimize_function_for_speed_p (cfun)) == 0)
+    return c;

you'd unconditionally use size costs?

> although the total difference is minimal, and ambiguous:
>
>              new text     old text
> cc1          13971302     13971342
> cc1plus      15882736     15882728
>
> Also, note that in the first case above, r14 is otherwise
> unused, and we wind up with an unnecessary save/restore of
> the register in the function.
>
> Thoughts?

We have an inverse issue elsewhere in that we don't CSE
a propagated constant but get

  mov $0, (%eax)
  mov $0, 4(%eax)
  ...

instead of doing one register clearing and then re-using
that register as zero.  But I suppose reload is not exactly
the place to fix that ;)

Richard.

>
> r~
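The inverse issue is easy to trigger; here is a minimal reproducer of my own construction (not taken from any PR), where a single register clear reused for both stores would be smaller at -Os than two immediate-operand stores:

```c
/* My own minimal example: constant propagation turns both stores
   into `mov $0x0, mem` with immediate operands, rather than
   clearing one register once (`xor %eax,%eax`) and storing it
   twice, which would be the shorter encoding.  */
struct pair { int a, b; };

void
clear_pair (struct pair *p)
{
  p->a = 0;   /* mov $0x0,(%rdi)    */
  p->b = 0;   /* mov $0x0,0x4(%rdi) */
}
```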