I noticed that the epilogue vectorization code doesn't compute the
correct number of iterations for the epilogue when peeling for gaps is
in effect.  This prevents epilogue vectorization in some cases and,
because the code also sets nb_iterations_upper_bound, can cause
wrong-code (I think we probably want to remove that code since it
should be redundant).
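
To illustrate the arithmetic, here is a minimal standalone sketch of the
old and new computations.  It is not the GCC code itself: the function
names eiters_old, eiters_new and check_example are hypothetical, and the
plain unsigned parameters stand in for LOOP_VINFO_INT_NITERS,
LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_PEELING_FOR_GAPS and
lowest_vf.

```c
#include <assert.h>

/* Old computation (hypothetical helper): ignores peeling for gaps.  */
unsigned int
eiters_old (unsigned int niters, unsigned int peel_align, unsigned int vf)
{
  return (niters - peel_align) % vf;
}

/* New computation (hypothetical helper): the iterations peeled for gaps
   are taken out before the modulo and added back afterwards, because
   they are left for the epilogue to execute.  */
unsigned int
eiters_new (unsigned int niters, unsigned int peel_align,
            unsigned int peel_gaps, unsigned int vf)
{
  return (niters - peel_align - peel_gaps) % vf + peel_gaps;
}

/* Worked example: 16 scalar iterations, no alignment peeling,
   a vectorization factor of 8 and one iteration peeled for gaps.  */
void
check_example (void)
{
  assert (eiters_old (16, 0, 8) == 0);     /* 16 % 8 */
  assert (eiters_new (16, 0, 1, 8) == 8);  /* 15 % 8 + 1 */
}
```

In this example the old formula reports zero epilogue iterations, so
the unsigned expression eiters - 1 wraps around, while the new formula
reports eight.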

Bootstrapped and tested on x86_64-unknown-linux-gnu.  I also ran
SPEC 2k6 with epilogue vectorization enabled on a core-avx2
machine successfully.

Applied to trunk.
Richard.

2018-12-03  Richard Biener  <rguent...@suse.de>

        * tree-vect-loop.c (vect_transform_loop): Properly compute
        upper bound for the epilogue when doing epilogue vectorization.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c        (revision 266665)
+++ gcc/tree-vect-loop.c        (working copy)
@@ -8548,9 +8548,12 @@ vect_transform_loop (loop_vec_info loop_
        {
          unsigned int eiters
            = (LOOP_VINFO_INT_NITERS (loop_vinfo)
-              - LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo));
-         eiters = eiters % lowest_vf;
+              - LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
+              - LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo));
+         eiters
+           = eiters % lowest_vf + LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo);
          epilogue->nb_iterations_upper_bound = eiters - 1;
+         epilogue->any_upper_bound = true;
 
          unsigned int ratio;
          while (next_size < vector_sizes.length ()
