Hi!

On Fri, Jul 29, 2022 at 07:57:51AM +0100, Roger Sayle wrote:
> > On Wed, Jul 27, 2022 at 02:42:25PM +0100, Roger Sayle wrote:
> > > This patch implements some additional zero-extension and
> > > sign-extension related optimizations in simplify-rtx.cc.  The original
> > > motivation comes from PR rtl-optimization/71775, where in comment #2
> > > Andrew Pinski sees:
> > >
> > > Failed to match this instruction:
> > > (set (reg:DI 88 [ _1 ])
> > >     (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))
> > >
> > > On many platforms the result of DImode CTZ is constrained to be a
> > > small unsigned integer (between 0 and 64), hence the truncation to
> > > 32-bits (using a SUBREG) and the following sign extension back to
> > > 64-bits are effectively a no-op, so the above should ideally (often)
> > > be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".
> > 
> > And you can also do that if ctz is undefined for a zero argument!
> 
> Forgive my perhaps poor use of terminology.  The case of ctz 0 on
> x86_64 isn't "undefined behaviour" (UB) in the C/C++ sense that
> would allow us to do anything, but implementation defined (which
> Intel calls "undefined" in their documentation).

This is about CTZ in RTL, in GCC.  CTZ_DEFINED_VALUE_AT_ZERO is 0 here,
which means a zero argument gives an undefined result.

> Hence, we don't
> know which DI value is placed in the result register.  In this case,
> truncating to SI mode, then sign extending the result is not a no-op,
> as the top bits will/must now all be the same [though admittedly to an
> unknown undefined signbit].

And any value is valid.

> Hence the above optimization would 
> be invalid, as it doesn't guarantee the result would be sign-extended.

It does not have to be!  Truncating an undefined DImode value to SImode
gives an undefined SImode value.  On most architectures (including x86
afaik) you do not need to do any machine insn for that (the top 32 bits
in the register are just ignored for a SImode value).

> > Also, this is not correct for C[LT]Z_DEFINED_VALUE_AT_ZERO non-zero if the
> > value it returns in its second arg does not survive sign extending
> > unmodified (if it is 0xffffffff for an extend from SI to DI for example).
> 
> Fortunately, C[LT]Z_DEFINED_VALUE_AT_ZERO being defined to return a negative
> result, such as -1 is already handled (accounted for) in nonzero_bits.  The
> relevant code in rtlanal.cc's nonzero_bits1 is:

A negative result, yes.  But that was not my example.
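
To make the distinction concrete, here is a minimal C sketch (plain C, not
GCC code; the function names are made up for illustration) of what
(sign_extend:DI (subreg:SI x 0)) does to a 64-bit value.  A DImode -1
survives it, and so does anything in 0..64, but a DImode 0x00000000ffffffff
does not:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of (sign_extend:DI (subreg:SI x 0)): keep the low
   32 bits of x, then sign-extend them back to 64 bits.  Done with
   unsigned arithmetic only, so nothing here is implementation-defined.  */
uint64_t sext_low32(uint64_t x)
{
    uint64_t lo = x & UINT64_C(0xffffffff);
    /* Flip bit 31, then subtract 2^31: this copies bit 31 into bits 32..63.  */
    return (lo ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* True if truncating x to 32 bits and sign-extending gives x back,
   i.e. if dropping the subreg/sign_extend pair is a no-op for x.  */
bool survives_sext(uint64_t x)
{
    return sext_low32(x) == x;
}
```

So survives_sext(64) and survives_sext(-1 as a 64-bit value) hold, while
survives_sext(0xffffffff) does not: the extend turns it into
0xffffffffffffffff.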


Segher
