On 20 June 2017 at 11:45, Richard Biener <[email protected]> wrote:
> On Tue, 20 Jun 2017, Alan Hayward wrote:
>
>>
>> > On 19 Jun 2017, at 13:35, Richard Biener <[email protected]> wrote:
>> >
>> > On Mon, 19 Jun 2017, Christophe Lyon wrote:
>> >
>> >> Hi Richard,
>> >>
>> >> On 16 June 2017 at 14:18, Richard Biener <[email protected]> wrote:
>> >>> On Wed, 14 Jun 2017, Richard Biener wrote:
>> >>>
>> >>>>
>> >>>> niter estimates are not kept up-to-date (they reference gimple stmts
>> >>>> and trees) in the keep-loop-stuff infrastructure so similar to the
>> >>>> SCEV cache we rely on people freeing it after passes.
>> >>>>
>> >>>> The following brings us a step closer to that by freeing them whenever
>> >>>> SCEV is invalidated (we only compute them when SCEV is active) plus
>> >>>> removing the odd record-bounds pass that just computes them, leaving
>> >>>> scavenging to following passes.
>> >>>>
>> >>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>> >>>
>> >>> Some awkward interactions with peeling means I'm installing the
>> >>> following less aggressive variant.
>> >>>
>> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>> >>>
>> >>> Richard.
>> >>>
>> >>> 2017-06-16 Richard Biener <[email protected]>
>> >>>
>> >>> PR tree-optimization/81090
>> >>> * passes.def (pass_record_bounds): Remove.
>> >>> * tree-pass.h (make_pass_record_bounds): Likewise.
>> >>> * tree-ssa-loop.c (pass_data_record_bounds, pass_record_bounds,
>> >>> make_pass_record_bounds): Likewise.
>> >>> * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Do
>> >>> not free niter estimates at the beginning but at the end.
>> >>> * tree-scalar-evolution.c (scev_finalize): Free niter estimates.
>> >>>
>> >>> * gcc.dg/graphite/pr81090.c: New testcase.
>> >>>
>> >>
>> >> Sorry to bother you again...
>> >> With this commit (r249249), I've noticed regressions on aarch64/arm:
>> >> FAIL: gcc.dg/vect/pr65947-9.c -flto -ffat-lto-objects
>> >> scan-tree-dump-not vect "LOOP VECTORIZED"
>> >> FAIL: gcc.dg/vect/pr65947-9.c scan-tree-dump-not vect "LOOP VECTORIZED"
>> >
>> > So the testcase gets vectorized now (for whatever reason) and still passes
>> > execution. Not sure why the testcase checked for not being vectorized.
>> >
>> > Alan?
>> >
>> > Richard.
>>
>> I’ve not looked at the new patch, but pr65947-9.c was added to test:
>>
>> + /* Condition reduction with maximum possible loop size. Will fail to
>> + vectorize because the vectorisation requires a slot for default values.
>> */
>>
>> So, in the pr65947-9.c, if nothing passes the IF clause, then LAST needs to
>> be
>> set to -72.
>
> So the runtime part of the testcase fails to test this case and we expect
> it to FAIL if vectorized?
>
> Index: testsuite/gcc.dg/vect/pr65947-9.c
> ===================================================================
> --- testsuite/gcc.dg/vect/pr65947-9.c (revision 249145)
> +++ testsuite/gcc.dg/vect/pr65947-9.c (working copy)
> @@ -34,9 +34,9 @@ main (void)
>
> check_vect ();
>
> - char ret = condition_reduction (a, 16);
> + char ret = condition_reduction (a, 1);
>
> - if (ret != 10)
> + if (ret != -72)
> abort ();
>
> return 0;
>
> On aarch64 I can reproduce the inline copy in main to be vectorized
> (doesn't happen on x86_64). niter analysis says:
>
> Analyzing # of iterations of loop 1
> exit condition [253, + , 4294967295] != 0
> bounds on difference of bases: -253 ... -253
> result:
> # of iterations 253, bounded by 253
> Analyzing # of iterations of loop 1
> exit condition [253, + , 4294967295] != 0
> bounds on difference of bases: -253 ... -253
> result:
> # of iterations 253, bounded by 253
> Statement (exit)if (ivtmp_45 != 0)
> is executed at most 253 (bounded by 253) + 1 times in loop 1.
>
> so it fits there. While the offline copy has
>
> Analyzing # of iterations of loop 1
> exit condition [254, + , 4294967295] != 0
> bounds on difference of bases: -254 ... -254
> result:
> # of iterations 254, bounded by 254
> Analyzing # of iterations of loop 1
> exit condition [254, + , 4294967295] != 0
> bounds on difference of bases: -254 ... -254
> result:
> # of iterations 254, bounded by 254
> Statement (exit)if (ivtmp_7 != 0)
> is executed at most 254 (bounded by 254) + 1 times in loop 1.
>
> we peeled one iteration (ch_loop does that) so we have the place
> left.
>
> Marking the function noinline works as a fix I guess.
>
> Tested on x86_64-unknown-linux-gnu, installed.
>
> Richard.
>
> 2017-06-20 Richard Biener <[email protected]>
>
> * gcc.dg/vect/pr65947-9.c: Adjust.
Hi,
After this change (r249400), the test fails on aarch64/arm:
FAIL: gcc.dg/vect/pr65947-9.c -flto -ffat-lto-objects
scan-tree-dump vect "loop size is greater than data size"
FAIL: gcc.dg/vect/pr65947-9.c scan-tree-dump vect "loop size is
greater than data size"
Christophe
>
> Index: gcc/testsuite/gcc.dg/vect/pr65947-9.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/pr65947-9.c (revision 249145)
> +++ gcc/testsuite/gcc.dg/vect/pr65947-9.c (working copy)
> @@ -9,10 +9,10 @@ extern void abort (void) __attribute__ (
> /* Condition reduction with maximum possible loop size. Will fail to
> vectorize because the vectorisation requires a slot for default values.
> */
>
> -char
> +signed char __attribute__((noinline,noclone))
> condition_reduction (char *a, char min_v)
> {
> - char last = -72;
> + signed char last = -72;
>
> for (int i = 0; i < N; i++)
> if (a[i] < min_v)
> @@ -21,10 +21,10 @@ condition_reduction (char *a, char min_v
> return last;
> }
>
> -char
> -main (void)
> +int
> +main ()
> {
> - char a[N] = {
> + signed char a[N] = {
> 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
> @@ -34,11 +34,14 @@ main (void)
>
> check_vect ();
>
> - char ret = condition_reduction (a, 16);
> -
> + signed char ret = condition_reduction (a, 16);
> if (ret != 10)
> abort ();
>
> + ret = condition_reduction (a, 1);
> + if (ret != -72)
> + abort ();
> +
> return 0;
> }
>