On Thu, Apr 06, 2023 at 12:51:20PM +0200, Eric Botcazou wrote:
> > If we want to fix it in the combiner, I think the fix would be following.
> > The optimization is about
> > (and:SI (subreg:SI (reg:HI xxx) 0) (const_int 0x84c))
> > and IMHO we can only optimize it into
> > (subreg:SI (and:HI (reg:HI xxx) (const_int 0x84c)) 0)
> > if we know that the upper bits of the REG are zeros.
> 
> The reasoning is that, for WORD_REGISTER_OPERATIONS, the subword AND operation
> is done on the full word register, in other words that it's in effect:
> 
> (subreg:SI (and:SI (reg:SI xxx) (const_int 0x84c)) 0)
> 
> that is equivalent to the initial RTL so correct for WORD_REGISTER_OPERATIONS.

If the
(and:SI (subreg:SI (reg:HI xxx) 0) (const_int 0x84c))
to
(subreg:SI (and:HI (reg:HI xxx) (const_int 0x84c)) 0)
transformation is kosher for WORD_REGISTER_OPERATIONS, then I guess the
invalid operation is in
simplify_context::simplify_binary_operation_1
    case AND:
...
      if (HWI_COMPUTABLE_MODE_P (mode))
        {
          HOST_WIDE_INT nzop0 = nonzero_bits (trueop0, mode);
          HOST_WIDE_INT nzop1;
          if (CONST_INT_P (trueop1))
            {
              HOST_WIDE_INT val1 = INTVAL (trueop1);
              /* If we are turning off bits already known off in OP0, we need
                 not do an AND.  */
              if ((nzop0 & ~val1) == 0)
                return op0;
            }
Here op0 == trueop0 is (reg:HI 175) and op1 == trueop1 is (const_int 2124
[0x84c]).
For (integral?) modes narrower than word_mode we would then need to
actually check nonzero_bits in word_mode (on a paradoxical subreg of
trueop0?).  If INTVAL (trueop1) is >= 0, then I think just calling
nonzero_bits in the wider mode would be all we need (although the
subsequent (nzop1 & nzop0) == 0 case probably wants to keep the current
nonzero_bits calls).  I'm not really sure what, for WORD_REGISTER_OPERATIONS,
an AND with a constant whose most significant bit is set means for the
upper bits.
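
To make the concern concrete, here is a small stand-alone C model (not GCC
code).  It assumes a 32-bit word register whose low 16 bits hold the HImode
value and whose upper 16 bits hold leftover data; the function names and the
sample value are made up for illustration.

```c
#include <stdint.h>

/* Models (and:SI (subreg:SI (reg:HI xxx) 0) (const_int 0x84c)):
   the AND is applied to the whole 32-bit word register.  */
static uint32_t
and_in_si (uint32_t word_reg)
{
  return word_reg & 0x84cu;
}

/* Models (subreg:SI (reg:HI xxx) 0) after the AND:HI has been simplified
   to its operand: nothing clears the upper 16 bits any more, so whatever
   the word register holds there shows through.  */
static uint32_t
and_dropped (uint32_t word_reg)
{
  return word_reg;
}
```

For an assumed register content of 0xabcd084c, and_in_si gives 0x84c while
and_dropped gives 0xabcd084c, even though the low 16 bits alone satisfy
(nzop0 & ~val1) == 0.  The two forms agree only when the upper bits happen to
be zero.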

So perhaps, just in the return op0; case, add further code for
WORD_REGISTER_OPERATIONS and sub-word modes that calls nonzero_bits
again in word_mode and decides whether returning op0 is still safe.
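
A rough sketch of that extra check, written as plain C rather than against
the real simplify-rtx.cc interfaces.  The function can_drop_and and all of
its parameters are hypothetical: nz_mode and nz_word stand for what
nonzero_bits might return in the narrow mode and in word_mode respectively,
and val is assumed to be a non-negative constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the "return op0" decision for AND with a constant.
   nz_mode: possibly-nonzero bits of the operand in its own (narrow) mode.
   nz_word: possibly-nonzero bits of the same operand viewed in word_mode.
   val:     the AND constant, assumed >= 0.  */
static bool
can_drop_and (uint64_t nz_mode, uint64_t nz_word, uint64_t val,
              bool word_register_operations, bool sub_word_mode)
{
  /* Existing test: the AND would only turn off bits already known off.  */
  if ((nz_mode & ~val) != 0)
    return false;

  /* Assumed extra test: on WORD_REGISTER_OPERATIONS targets, a sub-word
     AND may be relied upon to clear upper bits of the word register, so
     repeat the test with the word_mode mask.  */
  if (word_register_operations && sub_word_mode)
    return (nz_word & ~val) == 0;

  return true;
}
```

With nz_mode == 0x84c and val == 0x84c, the existing test alone allows
dropping the AND; with an unknown upper half (nz_word == 0xffffffff) the
sketched extra test rejects it.  Constants with the most significant bit set
are left out here, since it is unclear above what they should mean.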

> > Now, this patch fixes the PR, but certainly generates worse (but correct)
> > code than the dse.cc patch.  So perhaps we want both of them?
> 
> What happens if you disable the step I mentioned (patchlet attached)?

That patch doesn't change anything at all on the testcase, it is still
miscompiled.

        Jakub
