On Sat, Aug 13, 2022 at 12:35 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
> Hi Richard,
>
> > -----Original Message-----
> > From: Richard Biener <richard.guent...@gmail.com>
> > Sent: 08 August 2022 12:49
> > Subject: Re: [PATCH] PR tree-optimization/64992: (B << 2) != 0 is B when B 
> > is
> > Boolean.
> >
> > On Mon, Aug 8, 2022 at 11:06 AM Roger Sayle
> > <ro...@nextmovesoftware.com> wrote:
> > >
> > > This patch resolves both PR tree-optimization/64992 and PR
> > > tree-optimization/98956 which are missed optimization enhancement
> > > request, for which Andrew Pinski already has a proposed solution
> > > (related to a fix for PR tree-optimization/98954).  Yesterday, I
> > > proposed an alternate improved patch for PR98954, which although
> > > superior in most respects, alas didn't address this case [which
> > > doesn't include a BIT_AND_EXPR], hence this follow-up fix.
> > >
> > > For many functions, F(B), of a (zero-one) Boolean value B, the
> > > expression F(B) != 0 can often be simplified to just B.  Hence "(B *
> > > 5) != 0" is B, "-B != 0" is B, "bswap(B) != 0" is B, "(B >>r 3) != 0"
> > > is B.  These are all currently optimized by GCC, with the strange
> > > exception of left shifts by a constant (possibly due to the
> > > undefined/implementation defined behaviour when the shift constant is
> > > larger than the first operand's precision).
> > > This patch adds support for this particular case, when the shift
> > > constant is valid.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures.  Ok for mainline?
> >
> > +/* (X << C) != 0 can be simplified to X, when X is zero_one_valued_p.
> > +*/ (simplify
> > +  (ne (lshift zero_one_valued_p@0 INTEGER_CST@1) integer_zerop@2)
> > +  (if (tree_fits_shwi_p (@1)
> > +       && tree_to_shwi (@1) > 0
> > +       && tree_to_shwi (@1) < TYPE_PRECISION (TREE_TYPE (@0)))
> > +    (convert @0)))
> >
> > while we deliberately do not fold int << 34 since the result is undefined 
> > there is
> > IMHO no reason to not fold the above for any (even non-constant) shift 
> > value.
> > We have guards with TYPE_OVERFLOW_SANITIZED in some cases but I think
> > that's not appropriate here, there's one flag_sanitize check, maybe there's 
> > a
> > special bit for SHIFT overflow we can use.  Why is (X << 0) != 0 excempt in 
> > the
> > condition?
>
> In this case, I think it makes more sense to err on the side of caution, and
> avoid changing the observable behaviour of programs, even in cases were
> the behaviour is officially undefined.  For many targets, (1<<x) != 0 is 
> indeed
> always true for any value of x, but a counter example are x86's SSE shifts,
> where shifts beyond the size of the vector result in zero.  With STV, this
> means that (1<<258) != 0 has a different value if performed as scalar vs.
> performed as vector.  Worse, one may end up with examples, where based
> upon optimization level, we see different results as shift operands become
> propagated  constants in some paths, but was variable shifts others.
>
> Hence my personal preference is "first, do no harm" and limit this
> transformation to the safe 0 <= X < MODE_PRECISION (mode).
> Then given we'd like to avoid negative shifts, and therefore need to
> test against zero, my second preference is "0 < X" over "0 <= X".
> If the RTL contains a shift by zero, something strange is already going on
> (these should be caught optimized elsewhere), and it's better to leave
> these issues visible in the RTL, than paper over any "latent" mistakes.
>
> I fully I agree that this optimization could be more aggressive, but that
> isn't required to resolve this PR, and resolving PR64992 only to open
> the door for follow-up "unexpected behavior" PRs isn't great progress.
>
> Thoughts?  Ok for mainline?

OK - can you add a comment reflecting the above?  An improvement
might be to allow non-constant operands but test sth like

 expr_in_range (@1, 1, TYPE_PRECISION (...))

we already have expr_not_equal_to, expr_in_range could be done
with

        value_range vr;
        if (get_global_range_query ()->range_of_expr (vr, @0)
            && vr.kind () == VR_RANGE)
          {
            wide_int wmin0 = vr.lower_bound ();
            wide_int wmax0 = vr.upper_bound ();
...

there's a bunch of range_of_expr uses, I didn't check closely but
some might fit a new expr_in_range utility.

That can be done as followup (if you like).  Btw, I wonder if
bit-CCP would have not simplified the shift-by-constant compared
to zero as well?  If so, does that behave the same with respect
to 0 or out of bound shifts?

Richard.

>
> > > 2022-08-08  Roger Sayle  <ro...@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         PR tree-optimization/64992
> > >         PR tree-optimization/98956
> > >         * match.pd (ne (lshift @0 @1) 0): Simplify (X << C) != 0 to X
> > >         when X is zero_one_valued_p and the shift constant C is valid.
> > >         (eq (lshift @0 @1) 0): Likewise, simplify (X << C) == 0 to !X
> > >         when X is zero_one_valued_p and the shift constant C is valid.
> > >
> > > gcc/testsuite/ChangeLog
> > >         PR tree-optimization/64992
> > >         * gcc.dg/pr64992.c: New test case.
> > >
>
> Thanks,
> Roger
> --
>
>

Reply via email to