[Bug target/82267] x32: unnecessary address-size prefixes. Why isn't -maddress-mode=long the default?

peter at cordes dot ca Tue, 26 Sep 2017 09:52:01 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82267


--- Comment #6 from Peter Cordes <peter at cordes dot ca> ---
(In reply to H.J. Lu from comment #2)
> > Are there still cases where -maddress-mode=long makes worse code?
> 
> 
> Yes, there are more places where -maddress-mode=long needs to zero-extend
> address to 64 bits where 0x67 prefix does for you.

So ideally, gcc should use 0x67 opportunistically where it saves a
zero-extension instruction.

Using 64-bit address size opportunistically wherever we're sure it's safe seems
like a good idea, but I assume that's not easy to implement.

Can we teach  -maddress-mode=long  that a 0x67 prefix is a nearly-free way to
zero-extend as part of an addressing-mode, so it will use that instead of extra
instructions?


> > SSSE3 and later instructions need 66 0F 3A/38 before the opcode, so an
> > address-size or REX prefix will cause a decode stall on Silvermont.  With
> 
> That is true.

> > Similarly, Bulldozer-family has a 3-prefix limit, but doesn't
> > count escape bytes, and VEX only counts as 0 or 1 (for 2/3 byte VEX).
> 
> But 0x67 prefix is still better.

For tune=silvermont or knl, ideally we'd count prefixes and use an extra
instruction when it avoids a decode bottleneck.
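
To make "count prefixes" concrete, here is a rough sketch in Python (not GCC
code). It assumes the limit is three bytes of prefixes plus escape bytes,
which is what the quoted text implies for Silvermont. The function names and
the `limit` parameter are made up for illustration, and VEX/EVEX encodings
are not handled.

```python
# Sketch only: counts legacy prefixes, an optional REX byte, and opcode
# escape bytes at the start of an x86-64 instruction encoding.
# Assumption: the decoder limit is 3 such bytes (hypothetical default below).
LEGACY_PREFIXES = {0x66, 0x67, 0xF0, 0xF2, 0xF3,
                   0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65}


def count_prefix_and_escape_bytes(insn: bytes) -> int:
    i = 0
    while i < len(insn) and insn[i] in LEGACY_PREFIXES:
        i += 1
    if i < len(insn) and 0x40 <= insn[i] <= 0x4F:      # REX (64-bit mode)
        i += 1
    if i < len(insn) and insn[i] == 0x0F:              # first escape byte
        i += 1
        if i < len(insn) and insn[i] in (0x38, 0x3A):  # second escape byte
            i += 1
    return i


def exceeds_decode_limit(insn: bytes, limit: int = 3) -> bool:
    return count_prefix_and_escape_bytes(insn) > limit


with_67 = bytes.fromhex("67660f380000")    # pshufb (%eax), %xmm0
without_67 = bytes.fromhex("660f380000")   # pshufb (%rax), %xmm0
```

With these encodings, `exceeds_decode_limit(with_67)` is true and
`exceeds_decode_limit(without_67)` is false, so dropping the 0x67 prefix is
what keeps the instruction within the assumed limit.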

For tune=generic we should probably always use 0x67 when it saves an
instruction.  IDK about tune=bdver2.  Probably not worth worrying about too
much.


> Since the upper 32 bits of stack register are always zero for x32, we
> can encode %esp as %rsp to avoid 0x67 prefix in address if there is no
> index or base register.

Note that %rsp can't be an index register, so you only have to check if it's
the base register.

The SIB encodings that would mean index=RSP actually mean "no index".  The
ModRM encoding that would mean base=RSP instead means "there's a SIB byte". 
https://stackoverflow.com/a/46263495/224132

This means that `(%rsp)` is encodable, at the cost of (%rsp, %rsp, scale) not
being encodable.  Any other register (apart from %r12, which shares the same
low three encoding bits) can be used as a base with no SIB byte (unfortunately
for code-size with -fomit-frame-pointer).
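
A minimal sketch of those encoding rules, in Python, for the eight legacy
registers only (REX bits and non-zero displacements are left out, and
`encode_mem` is a made-up helper, not anything from GCC or binutils):

```python
# Sketch only: ModRM/SIB bytes for (base) or (base, index, scale) with no
# displacement requested. Register numbers are the 3-bit field values.
REG = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
       "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}


def encode_mem(base, index=None, scale=1, reg=0):
    if index == "rsp":
        # SIB.index == 100 means "no index", so %rsp can't be an index.
        raise ValueError("%rsp cannot be an index register")
    b = REG[base]
    need_sib = index is not None or b == 4   # ModRM.rm == 100 means "SIB follows"
    need_disp8 = b == 5                      # mod=00 with base=101 means disp32
    mod = 1 if need_disp8 else 0
    rm = 4 if need_sib else b
    out = [(mod << 6) | ((reg & 7) << 3) | rm]
    if need_sib:
        idx = REG[index] if index is not None else 4
        out.append((SCALE_BITS[scale] << 6) | (idx << 3) | b)
    if need_disp8:
        out.append(0)                        # explicit zero disp8
    return bytes(out)
```

For example, `encode_mem("rax")` is the single byte 0x00, while
`encode_mem("rsp")` is 0x04 0x24: the extra SIB byte is the code-size cost of
using %rsp as a base.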

Can this check be applied to  %rbp  in functions that use a frame pointer?

That might be possible even when it's harder to decide whether other registers
need zero- or sign-extension, because we're not sure whether they hold "the
pointer" or a signed integer pointer-difference.

However, simple dereference addressing modes (one register, no displacement)
can always use 64-bit address size when the register is known to be
zero-extended.
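
As a toy model of why the single-register case is safe (Python, with a
made-up `effective_address` helper; it only models the arithmetic, not real
hardware): a 0x67 prefix truncates the computed address to 32 bits, while
64-bit address size does not wrap.

```python
# Sketch only: effective address = base + index*scale + disp, truncated to
# the address size (32 bits with a 0x67 prefix in long mode, else 64).
def effective_address(base, index=0, scale=1, disp=0, addr_bits=64):
    return (base + index * scale + disp) & ((1 << addr_bits) - 1)


ptr = 0x1000             # a zero-extended 32-bit pointer
minus_one = 0xFFFFFFFF   # 32-bit -1 sitting zero-extended in a 64-bit register

# One zero-extended register, no displacement: both address sizes agree.
same = effective_address(ptr, addr_bits=32) == effective_address(ptr, addr_bits=64)

# Pointer plus a "negative" 32-bit index: only 32-bit address size wraps.
wrapped = effective_address(ptr, minus_one, addr_bits=32)     # 0xFFF
unwrapped = effective_address(ptr, minus_one, addr_bits=64)   # 0x100000FFF
```

The second case is the one where the compiler needs either the 0x67 prefix or
an explicit extension instruction.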
