On Thu, Feb 21, 2013 at 2:41 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: > Richard, > > This regression was introduced by Kai > > http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01988.html > > 2011-06-27 Kai Tietz <kti...@redhat.com> > > * tree-ssa-forwprop.c (simplify_bitwise_binary): Improve > type sinking. > * tree-ssa-math-opts.c (execute_optimize_bswap): Separate > search for di/si mode patterns for finding widest match. > > Is it sufficient for you to accept my patch?
No, I still don't think the fix is sound. A proper fix would revert the above change (the simplify_bitwise_binary one), watch for testsuite fallout (I can immediately see gcc.dg/tree-ssa/bitwise-sink.c failing) and fix those failures in a better way. Richard. > Best regards. > yuri. > > 2013/2/21 Richard Biener <richard.guent...@gmail.com>: >> On Thu, Feb 21, 2013 at 1:39 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Richard, >>> >>> As we know Kai is working on this problem for 4.9 and I assume that >>> type sinking will be deleted from forwprop pass. Could we stay on this >>> fix but more common fix will be done. >> >> Well, unless you show it is a regression the patch is not applicable for 4.8 >> anyway. Not sure if the code will be deleted from forwprop pass in 4.9 >> either, >> it is after all a canonicalization - fold seems to perform the opposite one >> though: >> >> /* Convert (T)(x & c) into (T)x & (T)c, if c is an integer >> constants (if x has signed type, the sign bit cannot be set >> in c). This folds extension into the BIT_AND_EXPR. >> >> note that what forwprop does (T)x & c -> (T)(x & c') I'd call type hoisting, >> not sinking. Generally frontends and fold try to narrow operands when >> possible (even though some targets later widen them again because of >> instruction set constraints). >> >> Most of this hoisting code was done to make lowering logical && and || >> I believe. Looking at the testcases added tells us that while matching >> the two patterns as done now helps them but only because that pattern >> feeds single-operand instructions that then simplify. So doing the transform >> starting from that single-operand instructions instead looks like a better >> fix (op !=/== 0/1 and (T) op) and also would not disagree with the >> canonicalization done by fold. >> >> Richard. >> >>> I also can propose to introduce new hook for it but need to know your >>> opinion since we don't went to waste our time on preparing dead >>> patches. Note that x86 supports all short types in HW and such type >>> sinkning is usually useless if short types are involved. >>> >>> Best regards. >>> Yuri. >>> >>> >>> 2013/2/21 Richard Biener <richard.guent...@gmail.com>: >>>> On Wed, Feb 20, 2013 at 4:41 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>> wrote: >>>>> Richard, >>>>> >>>>> First of all, your proposal to move type sinking to the end of >>>>> function does not work since we handle each statement in function and >>>>> we want that 1st type folding of X & C will not happen. >>>>> Note that we have the following sequence of gimple before forwprop1: >>>>> >>>>> x.0_10 = (signed char) x_8; >>>>> _11 = x.0_10 & 1; >>>>> _12 = (signed char) y_9; >>>>> _13 = _12 & 1; >>>>> _14 = _11 ^ _13; >>>> >>>> Ah, indeed. Reminds me of some of my dead patches that separated >>>> forwprop into a forward and backward stage. Of course then you have >>>> the ordering issue of whether to first forward or backward. >>>> >>>> Which means that I bet you can construct a testcase that with >>>> your change is no longer optimized (just make pushing the conversion >>>> make the types _match_). Which is always the case >>>> with this kind of local pattern-matching transforms. >>>> >>>> Currently forwprop processes leafs of expression trees first (well, inside >>>> a basic-block), similar to how fold () is supposed to be operated, based >>>> on the idea that simplified / canonicalized leafs helps keeping pattern >>>> recognition simple and cost considerations more accurate. >>>> >>>> When one order works better than another you always have to consider >>>> that the user could already have written the code in a way that results >>>> in the input that isn't well handled. >>>> >>>> Not that this helps very much for the situation ;) >>>> >>>> But I don't like the use of first_pass_instance ... and the fix isn't >>>> an improvement but just a hack for the benchmark. >>>> >>>> Richard. >>>> >>>>> I also added comment to my fix and create new test for it. I also >>>>> checked that this test is passed with patched compiler only. So >>>>> Change Log was also modified: >>>>> >>>>> ChangeLog >>>>> >>>>> 2013-02-20 Yuri Rumyantsev <ysrum...@gmail.com> >>>>> >>>>> PR tree-optimization/56175 >>>>> * tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type >>>>> sinking >>>>> at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C >>>>> for short integer types. >>>>> * gcc.dg/pr56175.c: New test. >>>>> >>>>> >>>>> >>>>> >>>>> 2013/2/20 Richard Biener <richard.guent...@gmail.com>: >>>>>> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> wrote: >>>>>>> Hi All, >>>>>>> >>>>>>> This patch is aimed to recognize (A & C) ^ (B & C) -> (A ^ B) & C >>>>>>> pattern in simpify_bitwise_binary for short integer types. >>>>>>> The fix is very simple - we simply turn off short type sinking at the >>>>>>> first pass of forward propagation allows to get >>>>>>> +10% speedup for important benchmark Coremark 1.0 at x86 Atom and >>>>>>> +5-7% for other x86 platforms too. >>>>>>> Bootstrapping and regression testing were successful on x86-64. >>>>>>> >>>>>>> Is it Ok for trunk? >>>>>> >>>>>> It definitely needs a comment before the checks. >>>>>> >>>>>> Also I think it simply shows that the code is placed at the wrong spot. >>>>>> Simply moving it down in simplify_bitwise_binary to be done the very last >>>>>> should get both of the effects done. >>>>>> >>>>>> Can you rework the patch according to that? >>>>>> >>>>>> You also miss a testcase, we should make sure to not regress again here. >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>>> ChangeLog. >>>>>>> >>>>>>> 2013-02-20 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> >>>>>>> PR tree-optimization/56175 >>>>>>> * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type >>>>>>> sinking >>>>>>> at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) >>>>>>> & C >>>>>>> for short integer types.