On Sat, Aug 13, 2022 at 12:35 AM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > Hi Richard, > > > -----Original Message----- > > From: Richard Biener <richard.guent...@gmail.com> > > Sent: 08 August 2022 12:49 > > Subject: Re: [PATCH] PR tree-optimization/64992: (B << 2) != 0 is B when B > > is > > Boolean. > > > > On Mon, Aug 8, 2022 at 11:06 AM Roger Sayle > > <ro...@nextmovesoftware.com> wrote: > > > > > > This patch resolves both PR tree-optimization/64992 and PR > > > tree-optimization/98956 which are missed optimization enhancement > > > request, for which Andrew Pinski already has a proposed solution > > > (related to a fix for PR tree-optimization/98954). Yesterday, I > > > proposed an alternate improved patch for PR98954, which although > > > superior in most respects, alas didn't address this case [which > > > doesn't include a BIT_AND_EXPR], hence this follow-up fix. > > > > > > For many functions, F(B), of a (zero-one) Boolean value B, the > > > expression F(B) != 0 can often be simplified to just B. Hence "(B * > > > 5) != 0" is B, "-B != 0" is B, "bswap(B) != 0" is B, "(B >>r 3) != 0" > > > is B. These are all currently optimized by GCC, with the strange > > > exception of left shifts by a constant (possibly due to the > > > undefined/implementation defined behaviour when the shift constant is > > > larger than the first operand's precision). > > > This patch adds support for this particular case, when the shift > > > constant is valid. > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > > and make -k check, both with and without --target_board=unix{-m32}, > > > with no new failures. Ok for mainline? > > > > +/* (X << C) != 0 can be simplified to X, when X is zero_one_valued_p. > > +*/ (simplify > > + (ne (lshift zero_one_valued_p@0 INTEGER_CST@1) integer_zerop@2) > > + (if (tree_fits_shwi_p (@1) > > + && tree_to_shwi (@1) > 0 > > + && tree_to_shwi (@1) < TYPE_PRECISION (TREE_TYPE (@0))) > > + (convert @0))) > > > > while we deliberately do not fold int << 34 since the result is undefined > > there is > > IMHO no reason to not fold the above for any (even non-constant) shift > > value. > > We have guards with TYPE_OVERFLOW_SANITIZED in some cases but I think > > that's not appropriate here, there's one flag_sanitize check, maybe there's > > a > > special bit for SHIFT overflow we can use. Why is (X << 0) != 0 excempt in > > the > > condition? > > In this case, I think it makes more sense to err on the side of caution, and > avoid changing the observable behaviour of programs, even in cases were > the behaviour is officially undefined. For many targets, (1<<x) != 0 is > indeed > always true for any value of x, but a counter example are x86's SSE shifts, > where shifts beyond the size of the vector result in zero. With STV, this > means that (1<<258) != 0 has a different value if performed as scalar vs. > performed as vector. Worse, one may end up with examples, where based > upon optimization level, we see different results as shift operands become > propagated constants in some paths, but was variable shifts others. > > Hence my personal preference is "first, do no harm" and limit this > transformation to the safe 0 <= X < MODE_PRECISION (mode). > Then given we'd like to avoid negative shifts, and therefore need to > test against zero, my second preference is "0 < X" over "0 <= X". > If the RTL contains a shift by zero, something strange is already going on > (these should be caught optimized elsewhere), and it's better to leave > these issues visible in the RTL, than paper over any "latent" mistakes. > > I fully I agree that this optimization could be more aggressive, but that > isn't required to resolve this PR, and resolving PR64992 only to open > the door for follow-up "unexpected behavior" PRs isn't great progress. > > Thoughts? Ok for mainline?
OK - can you add a comment reflecting the above? An improvement might be to allow non-constant operands but test sth like expr_in_range (@1, 1, TYPE_PRECISION (...)) we already have expr_not_equal_to, expr_in_range could be done with value_range vr; if (get_global_range_query ()->range_of_expr (vr, @0) && vr.kind () == VR_RANGE) { wide_int wmin0 = vr.lower_bound (); wide_int wmax0 = vr.upper_bound (); ... there's a bunch of range_of_expr uses, I didn't check closely but some might fit a new expr_in_range utility. That can be done as followup (if you like). Btw, I wonder if bit-CCP would have not simplified the shift-by-constant compared to zero as well? If so, does that behave the same with respect to 0 or out of bound shifts? Richard. > > > > 2022-08-08 Roger Sayle <ro...@nextmovesoftware.com> > > > > > > gcc/ChangeLog > > > PR tree-optimization/64992 > > > PR tree-optimization/98956 > > > * match.pd (ne (lshift @0 @1) 0): Simplify (X << C) != 0 to X > > > when X is zero_one_valued_p and the shift constant C is valid. > > > (eq (lshift @0 @1) 0): Likewise, simplify (X << C) == 0 to !X > > > when X is zero_one_valued_p and the shift constant C is valid. > > > > > > gcc/testsuite/ChangeLog > > > PR tree-optimization/64992 > > > * gcc.dg/pr64992.c: New test case. > > > > > Thanks, > Roger > -- > >