On Thu, Feb 21, 2013 at 2:41 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
> Richard,
>
> This regression was introduced by Kai
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01988.html
>
> 2011-06-27  Kai Tietz  <kti...@redhat.com>
>
>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Improve
>         type sinking.
>         * tree-ssa-math-opts.c (execute_optimize_bswap): Separate
>         search for di/si mode patterns for finding widest match.
>
> Is it sufficient for you to accept my patch?

No, I still don't think the fix is sound.  A proper fix would revert the
above change (the simplify_bitwise_binary one), watch for testsuite
fallout (I can immediately see gcc.dg/tree-ssa/bitwise-sink.c failing)
and fix those failures in a better way.

Richard.

> Best regards.
> yuri.
>
> 2013/2/21 Richard Biener <richard.guent...@gmail.com>:
>> On Thu, Feb 21, 2013 at 1:39 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>> Richard,
>>>
>>> As we know Kai is working on this problem for 4.9 and I assume that
>>> type sinking will be deleted from forwprop pass. Could we stay on this
>>> fix but more common fix will be done.
>>
>> Well, unless you show it is a regression the patch is not applicable for 4.8
>> anyway.  Not sure if the code will be deleted from forwprop pass in 4.9 
>> either,
>> it is after all a canonicalization - fold seems to perform the opposite one
>> though:
>>
>>       /* Convert (T)(x & c) into (T)x & (T)c, if c is an integer
>>          constants (if x has signed type, the sign bit cannot be set
>>          in c).  This folds extension into the BIT_AND_EXPR.
>>
>> note that what forwprop does (T)x & c -> (T)(x & c') I'd call type hoisting,
>> not sinking.  Generally frontends and fold try to narrow operands when
>> possible (even though some targets later widen them again because of
>> instruction set constraints).
>>
>> Most of this hoisting code was done to make lowering logical && and ||
>> I believe.  Looking at the testcases added tells us that while matching
>> the two patterns as done now helps them but only because that pattern
>> feeds single-operand instructions that then simplify.  So doing the transform
>> starting from that single-operand instructions instead looks like a better
>> fix (op !=/== 0/1 and (T) op) and also would not disagree with the
>> canonicalization done by fold.
>>
>> Richard.
>>
>>> I also can propose to introduce new hook for it but need to know your
>>> opinion since we don't went to waste our time on preparing dead
>>> patches. Note that x86 supports all short types in HW and such type
>>> sinkning is usually useless if short types are involved.
>>>
>>> Best regards.
>>> Yuri.
>>>
>>>
>>> 2013/2/21 Richard Biener <richard.guent...@gmail.com>:
>>>> On Wed, Feb 20, 2013 at 4:41 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>> wrote:
>>>>> Richard,
>>>>>
>>>>> First of all, your proposal to move type sinking to the end of
>>>>> function does not work since we handle each statement in function and
>>>>> we want that 1st type folding of X & C will not happen.
>>>>> Note that we have the following sequence of gimple before forwprop1:
>>>>>
>>>>>    x.0_10 = (signed char) x_8;
>>>>>   _11 = x.0_10 & 1;
>>>>>   _12 = (signed char) y_9;
>>>>>   _13 = _12 & 1;
>>>>>   _14 = _11 ^ _13;
>>>>
>>>> Ah, indeed.  Reminds me of some of my dead patches that separated
>>>> forwprop into a forward and backward stage.  Of course then you have
>>>> the ordering issue of whether to first forward or backward.
>>>>
>>>> Which means that I bet you can construct a testcase that with
>>>> your change is no longer optimized (just make pushing the conversion
>>>> make the types _match_).  Which is always the case
>>>> with this kind of local pattern-matching transforms.
>>>>
>>>> Currently forwprop processes leafs of expression trees first (well, inside
>>>> a basic-block), similar to how fold () is supposed to be operated, based
>>>> on the idea that simplified / canonicalized leafs helps keeping pattern
>>>> recognition simple and cost considerations more accurate.
>>>>
>>>> When one order works better than another you always have to consider
>>>> that the user could already have written the code in a way that results
>>>> in the input that isn't well handled.
>>>>
>>>> Not that this helps very much for the situation ;)
>>>>
>>>> But I don't like the use of first_pass_instance ... and the fix isn't
>>>> an improvement but just a hack for the benchmark.
>>>>
>>>> Richard.
>>>>
>>>>> I also added comment to my fix and create new test for it. I also
>>>>> checked that this test is passed with patched compiler  only. So
>>>>> Change Log was also modified:
>>>>>
>>>>> ChangeLog
>>>>>
>>>>> 2013-02-20  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>
>>>>>         PR tree-optimization/56175
>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type 
>>>>> sinking
>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>         for short integer types.
>>>>>         * gcc.dg/pr56175.c: New test.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2013/2/20 Richard Biener <richard.guent...@gmail.com>:
>>>>>> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>> wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> This patch is aimed to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>> pattern in simpify_bitwise_binary for short integer types.
>>>>>>> The fix is very simple - we simply turn off short type sinking at the
>>>>>>> first pass of forward propagation allows to get
>>>>>>> +10% speedup for important benchmark Coremark 1.0 at x86 Atom and
>>>>>>> +5-7% for other x86 platforms too.
>>>>>>> Bootstrapping and regression testing were successful on x86-64.
>>>>>>>
>>>>>>> Is it Ok for trunk?
>>>>>>
>>>>>> It definitely needs a comment before the checks.
>>>>>>
>>>>>> Also I think it simply shows that the code is placed at the wrong spot.
>>>>>> Simply moving it down in simplify_bitwise_binary to be done the very last
>>>>>> should get both of the effects done.
>>>>>>
>>>>>> Can you rework the patch according to that?
>>>>>>
>>>>>> You also miss a testcase, we should make sure to not regress again here.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>> ChangeLog.
>>>>>>>
>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>
>>>>>>>         PR tree-optimization/56175
>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type 
>>>>>>> sinking
>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) 
>>>>>>> & C
>>>>>>>         for short integer types.

Reply via email to