https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63281

--- Comment #9 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
Also reported by Donald Stence this week:

The compiler produces excessively long instruction sequences to synthesize
some literal constants. This contributes excess path length and potentially
latency. Constants requiring only 2 or 3 instructions are acceptable. Those
needing more than 3 should instead be loaded from the GOT (i.e., stored as
data in the GOT).

Compile the test case with either -O or -O3, default processor (AT-11.0-0).

Example constant from perlbench: 0x000800004100001.
Resulting sequence:
        li    3,0
        ori  3,3,0x8000
        sldi 3,3,32
        oris 3,3,0x410
        ori  3,3,0x1
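
To make the arithmetic of that sequence explicit, here is a minimal C sketch
that mirrors it one statement per instruction and checks the result against
the 64-bit value 0x0000800004100001. The function names synthesize_constant
and check_sequence are hypothetical, chosen only for illustration; the sketch
models r3 as a uint64_t and is not compiler output. Note that li takes a
sign-extended 16-bit immediate, so 0x8000 cannot be loaded directly with li,
which is why the sequence starts with li 3,0 followed by ori.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models the five-instruction sequence from the
   report, treating r3 as a 64-bit register. */
static uint64_t synthesize_constant(void)
{
    uint64_t r3;

    r3 = 0;                        /* li   3,0         */
    r3 |= UINT64_C(0x8000);        /* ori  3,3,0x8000  */
    r3 <<= 32;                     /* sldi 3,3,32      */
    r3 |= UINT64_C(0x410) << 16;   /* oris 3,3,0x410   */
    r3 |= UINT64_C(0x1);           /* ori  3,3,0x1     */
    return r3;
}

/* Hypothetical check: the modelled sequence yields the reported constant. */
static void check_sequence(void)
{
    assert(synthesize_constant() == UINT64_C(0x0000800004100001));
}
```

Calling check_sequence() from a test driver confirms that the five
instructions produce the constant; the sketch says nothing about their cost
relative to a single load.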

Loading the constants instead was ~20% faster on the block of some 30
instructions prior to the switch in the top function of perlbench
(S_regmatch). That section of code contained two longer sequences (one of 4
instructions, the other of 5, although, as [Segher] pointed out, the
5-instruction one can be done in 4). The rest was 1- or 2-instruction
constant synthesis or addi, plus an out-of-bounds check for the switch. I
replaced the two longer sequences with ld off r2 to get the ~20%. Of course,
this was measured in isolation, but I believe the result to be sound.
