Re: [PATCH] Fix PR81090, properly free niter estimates

Christophe Lyon Wed, 21 Jun 2017 05:47:22 -0700

On 20 June 2017 at 11:45, Richard Biener <rguent...@suse.de> wrote:
> On Tue, 20 Jun 2017, Alan Hayward wrote:
>
>>
>> > On 19 Jun 2017, at 13:35, Richard Biener <rguent...@suse.de> wrote:
>> >
>> > On Mon, 19 Jun 2017, Christophe Lyon wrote:
>> >
>> >> Hi Richard,
>> >>
>> >> On 16 June 2017 at 14:18, Richard Biener <rguent...@suse.de> wrote:
>> >>> On Wed, 14 Jun 2017, Richard Biener wrote:
>> >>>
>> >>>>
>> >>>> niter estimates are not kept up-to-date (they reference gimple stmts
>> >>>> and trees) in the keep-loop-stuff infrastructure so similar to the
>> >>>> SCEV cache we rely on people freeing it after passes.
>> >>>>
>> >>>> The following brings us a step closer to that by freeing them whenever
>> >>>> SCEV is invalidated (we only compute them when SCEV is active) plus
>> >>>> removing the odd record-bounds pass that just computes them, leaving
>> >>>> scavenging to following passes.
>> >>>>
>> >>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>> >>>
>> >>> Some awkward interactions with peeling means I'm installing the
>> >>> following less aggressive variant.
>> >>>
>> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>> >>>
>> >>> Richard.
>> >>>
>> >>> 2017-06-16  Richard Biener  <rguent...@suse.de>
>> >>>
>> >>>        PR tree-optimization/81090
>> >>>        * passes.def (pass_record_bounds): Remove.
>> >>>        * tree-pass.h (make_pass_record_bounds): Likewise.
>> >>>        * tree-ssa-loop.c (pass_data_record_bounds, pass_record_bounds,
>> >>>        make_pass_record_bounds): Likewise.
>> >>>        * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Do
>> >>>        not free niter estimates at the beginning but at the end.
>> >>>        * tree-scalar-evolution.c (scev_finalize): Free niter estimates.
>> >>>
>> >>>        * gcc.dg/graphite/pr81090.c: New testcase.
>> >>>
>> >>
>> >> Sorry to bother you again...
>> >> With this commit (r249249), I've noticed regressions on aarch64/arm:
>> >> FAIL:    gcc.dg/vect/pr65947-9.c -flto -ffat-lto-objects scan-tree-dump-not vect "LOOP VECTORIZED"
>> >> FAIL:    gcc.dg/vect/pr65947-9.c scan-tree-dump-not vect "LOOP VECTORIZED"
>> >
>> > So the testcase gets vectorized now (for whatever reason) and still passes
>> > execution.  Not sure why the testcase checked for not being vectorized.
>> >
>> > Alan?
>> >
>> > Richard.
>>
>> I’ve not looked at the new patch, but pr65947-9.c was added to test:
>>
>> + /* Condition reduction with maximum possible loop size.  Will fail to
>> +    vectorize because the vectorisation requires a slot for default values.  */
>>
>> So, in pr65947-9.c, if nothing passes the IF clause, then LAST needs to be
>> set to -72.
>
> So the runtime part of the testcase fails to test this case and we expect
> it to FAIL if vectorized?
>
> Index: testsuite/gcc.dg/vect/pr65947-9.c
> ===================================================================
> --- testsuite/gcc.dg/vect/pr65947-9.c   (revision 249145)
> +++ testsuite/gcc.dg/vect/pr65947-9.c   (working copy)
> @@ -34,9 +34,9 @@ main (void)
>
>    check_vect ();
>
> -  char ret = condition_reduction (a, 16);
> +  char ret = condition_reduction (a, 1);
>
> -  if (ret != 10)
> +  if (ret != -72)
>      abort ();
>
>    return 0;
>
> On aarch64 I can reproduce the inline copy in main to be vectorized
> (doesn't happen on x86_64).  niter analysis says:
>
> Analyzing # of iterations of loop 1
>   exit condition [253, + , 4294967295] != 0
>   bounds on difference of bases: -253 ... -253
>   result:
>     # of iterations 253, bounded by 253
> Analyzing # of iterations of loop 1
>   exit condition [253, + , 4294967295] != 0
>   bounds on difference of bases: -253 ... -253
>   result:
>     # of iterations 253, bounded by 253
> Statement (exit)if (ivtmp_45 != 0)
>  is executed at most 253 (bounded by 253) + 1 times in loop 1.
>
> so it fits there.  While the offline copy has
>
> Analyzing # of iterations of loop 1
>   exit condition [254, + , 4294967295] != 0
>   bounds on difference of bases: -254 ... -254
>   result:
>     # of iterations 254, bounded by 254
> Analyzing # of iterations of loop 1
>   exit condition [254, + , 4294967295] != 0
>   bounds on difference of bases: -254 ... -254
>   result:
>     # of iterations 254, bounded by 254
> Statement (exit)if (ivtmp_7 != 0)
>  is executed at most 254 (bounded by 254) + 1 times in loop 1.
>
> we peeled one iteration (ch_loop does that) so we have the place
> left.
>
> Marking the function noinline works as a fix I guess.
>
> Tested on x86_64-unknown-linux-gnu, installed.
>
> Richard.
>
> 2017-06-20  Richard Biener  <rguent...@suse.de>
>
>         * gcc.dg/vect/pr65947-9.c: Adjust.
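
A note on reading the niter dumps quoted above: the step 4294967295 in
"[253, + , 4294967295]" is 2^32 - 1, which behaves as -1 if the induction
variable is a 32-bit unsigned counter (an assumption inferred from the step
value, not something the dump states). Under that assumption, a rough sketch
of why the analysis reports 253 and 254 iterations could look like this; the
function name is hypothetical and not part of GCC:

```c
#include <stdint.h>

/* Hypothetical model of an IV {start, +, 4294967295} with exit test
   "iv != 0".  Assumes a 32-bit unsigned IV, so adding 4294967295
   wraps around and acts as a decrement by one.  */
uint32_t
count_latch_executions (uint32_t start)
{
  uint32_t iv = start;
  uint32_t niter = 0;

  while (iv != 0)
    {
      iv += UINT32_C (4294967295);  /* wraps: same effect as iv - 1 */
      niter++;
    }

  return niter;
}
```

With a start value of 253 the count is 253 (the inline copy in main), and
with 254 it is 254 (the offline copy), matching the "# of iterations" lines
in the dumps.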


Hi,

After this change (r249400), the test fails on aarch64/arm:
FAIL:    gcc.dg/vect/pr65947-9.c -flto -ffat-lto-objects scan-tree-dump vect "loop size is greater than data size"
FAIL:    gcc.dg/vect/pr65947-9.c scan-tree-dump vect "loop size is greater than data size"

Christophe

>
> Index: gcc/testsuite/gcc.dg/vect/pr65947-9.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/pr65947-9.c       (revision 249145)
> +++ gcc/testsuite/gcc.dg/vect/pr65947-9.c       (working copy)
> @@ -9,10 +9,10 @@ extern void abort (void) __attribute__ (
>  /* Condition reduction with maximum possible loop size.  Will fail to
>     vectorize because the vectorisation requires a slot for default values.  */
>
> -char
> +signed char __attribute__((noinline,noclone))
>  condition_reduction (char *a, char min_v)
>  {
> -  char last = -72;
> +  signed char last = -72;
>
>    for (int i = 0; i < N; i++)
>      if (a[i] < min_v)
> @@ -21,10 +21,10 @@ condition_reduction (char *a, char min_v
>    return last;
>  }
>
> -char
> -main (void)
> +int
> +main ()
>  {
> -  char a[N] = {
> +  signed char a[N] = {
>    11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
>    1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
>    21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
> @@ -34,11 +34,14 @@ main (void)
>
>    check_vect ();
>
> -  char ret = condition_reduction (a, 16);
> -
> +  signed char ret = condition_reduction (a, 16);
>    if (ret != 10)
>      abort ();
>
> +  ret = condition_reduction (a, 1);
> +  if (ret != -72)
> +    abort ();
> +
>    return 0;
>  }
>
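
The two runtime checks added by the patch above can be tried out with a
scaled-down, scalar-only sketch. The loop body "last = a[i]" is not visible
in the quoted hunks and is assumed here. N is set to a hypothetical 30 so
that only the three array rows shown in the diff are needed, and the array
name sample_input is made up for illustration. It is not the actual
testcase:

```c
#define N 30  /* hypothetical size; the real test uses its own N */

/* Returns the last element below min_v, or the default -72 if no
   element passes the condition.  The assignment in the loop body is an
   assumption; the diff hunks do not show it.  */
signed char __attribute__((noinline,noclone))
condition_reduction (const signed char *a, signed char min_v)
{
  signed char last = -72;

  for (int i = 0; i < N; i++)
    if (a[i] < min_v)
      last = a[i];

  return last;
}

/* The three rows visible in the quoted diff.  */
const signed char sample_input[N] = {
  11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
  1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
  21, 22, 23, 24, 25, 26, 27, 28, 29, 30
};
```

Calling condition_reduction (sample_input, 16) gives 10, the last value
below 16, while condition_reduction (sample_input, 1) matches nothing and
gives the default -72, which is the case the original test did not
exercise at run time.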
