Hi,

On 3 February 2017 at 08:56, Jeff Law <l...@redhat.com> wrote:
> On 02/02/2017 01:29 PM, Jan Hubicka wrote:
>>
>> Hi,
>> this patches fixes profile updating in the ifcombine.  This is not hard to
>> do
>> and ifcombine is #2 profile update offender out of tree passes (#1 is the
>> vectorizer).
>>
>> I think this counts as a regression, becuase one can trigger arbitrarily
>> bad profile after ifconversion and defnitly construct a testcase where
>> this
>> will cause us optimize for size where we optimized for speed previously.
>>
>> Bootstrapped/regtested x86_64-linux. Will commit it tomorrow (after
>> testers
>> pick up the threading fix) unless there are complains.
>>
>>         * gcc.dg/tree-ssa/ssa-ifcombine-1.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-2.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-3.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-4.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-5.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-6.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-7.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-8.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-9.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-10.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-11.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-12.c: Check for no profile
>> mismatches.
>>         * gcc.dg/tree-ssa/ssa-ifcombine-13.c: Check for no profile
>> mismatches.
>>
>>         * tree-ssa-ifcombine.c (update_profile_after_ifcombine): New
>> function.
>>         (ifcombine_ifandif): Use it.
>>
>> Index: testsuite/gcc.dg/tree-ssa/threadbackward-1.c
>> ===================================================================
>> --- testsuite/gcc.dg/tree-ssa/threadbackward-1.c        (revision 0)
>> +++ testsuite/gcc.dg/tree-ssa/threadbackward-1.c        (working copy)
>> @@ -0,0 +1,9 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-ethread" } */
>> +char *c;
>> +int t()
>> +{
>> +  for (int i=0;i<5000;i++)
>> +    c[i]=i;
>> +}
>> +/* { dg-final { scan-tree-dump-times "Registering FSM jump thread" 1
>> "ethread"} } */
>
> Seems like this is unrelated to fixing profile updates in ifcombine.  If you
> want the new test it's probably best done in another independent patch :-)
>
>
>> Index: tree-ssa-ifcombine.c
>> ===================================================================
>> --- tree-ssa-ifcombine.c        (revision 245134)
>> +++ tree-ssa-ifcombine.c        (working copy)
>> @@ -332,6 +332,51 @@ recognize_bits_test (gcond *cond, tree *
>>    return true;
>>  }
>>
>> +
>> +/* Update profile after code in outer_cond_bb was adjuted so
>
> s/adjuted/adjusted/
>
> OK with the nit fixed.
>
> jeff
>

After this patch (r245151), I've noticed regressions on aarch64:
  gcc.target/aarch64/test_frame_1.c scan-assembler-times ldr\tx30,
\\[sp\\], [0-9]+ 2
  gcc.target/aarch64/test_frame_2.c scan-assembler-times ldp\tx19,
x30, \\[sp\\], [0-9]+ 1
  gcc.target/aarch64/test_frame_4.c scan-assembler-times ldp\tx19,
x30, \\[sp\\], [0-9]+ 1
  gcc.target/aarch64/test_frame_6.c scan-assembler-times ldr\tx30, \\[sp\\] 2
  gcc.target/aarch64/test_frame_7.c scan-assembler-times ldp\tx19,
x30, \\[sp\\] 1

now FAIL.

For instance in test_frame_1.c, we used to generate:
    ccmp    w2, w1, 0, eq
    bne    .L6
    ldrb    w0, [sp, 123]
    ldrb    w1, [sp, 124]
    cmp    w0, 99
    ccmp    w1, w0, 0, eq
    cset    w0, eq
.L6:
    ldr    x30, [sp], 224
    ret

and now:
    ccmp    w2, w1, 0, eq
    beq    .L11
    ldr    x30, [sp], 224
    ret
    .p2align 3
.L11:
    ldrb    w0, [sp, 123]
    ldrb    w1, [sp, 124]
    cmp    w0, 99
    ldr    x30, [sp], 224
    ccmp    w1, w0, 0, eq
    cset    w0, eq
    ret

which is 2 instructions more, as the control flow is less efficient.

Do we want to just update the tests (increasing the number of expected
str/ldr/stp/ldp to match the new code generation), or do we
consider this a regression caused by this patch?

Christophe

Reply via email to