https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65443

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35078|0                           |1
        is obsolete|                            |

--- Comment #10 from vries at gcc dot gnu.org ---
Created attachment 35092
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35092&action=edit
WIP patch

Updated patch that fixes the probability/frequency. The generated code for the
loopfn is now identical in the optimized dump (previously we were sinking loads
into the loop nest because of the broken probability/frequency).

The main difference in generated code at the optimized dump is this:
...
   <bb 5>:
+  n_24 = n_5(D);
   .paral_data_store.6.a = &a;
   .paral_data_store.6.b = &b;
   .paral_data_store.6.c = &c;
-  .paral_data_store.6.D.1854 = _12;
+  .paral_data_store.6.D.1854 = n_5(D);
   __builtin_GOMP_parallel (f._loopfn.0, &.paral_data_store.6, 2, 0);
-  ivtmp_27 = (signed int) _12;
-  _29 = a[ivtmp_27];
-  _30 = b[ivtmp_27];
-  _31 = _29 + _30;
-  c[ivtmp_27] = _31;
...

That is, we increase the number of iterations by one (from n - 1 to n) and
remove the peeled-off last loop iteration (the code after the
__builtin_GOMP_parallel call).
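
As a rough source-level illustration of that difference (a sketch only, not the
testcase from this PR): the array size N, the element type, the function names
and the check helper below are all hypothetical, and the plain loops merely
stand in for the work handed to __builtin_GOMP_parallel. It assumes
1 <= n <= N.

```c
#include <assert.h>
#include <string.h>

#define N 1000  /* hypothetical array size, not taken from the PR */

unsigned int a[N], b[N], c[N];

/* Before the patch: the parallelized part covers n - 1 iterations and the
   last iteration is peeled off and executed after the parallel call.  */
void
sum_peeled (unsigned int n)
{
  unsigned int i;

  /* Stand-in for the loop body run via __builtin_GOMP_parallel.  */
  for (i = 0; i < n - 1; i++)
    c[i] = a[i] + b[i];

  /* Peeled-off last iteration (the code after the parallel call).  */
  c[n - 1] = a[n - 1] + b[n - 1];
}

/* With the patch: the parallelized part covers all n iterations and no
   peeled iteration remains.  */
void
sum_full (unsigned int n)
{
  unsigned int i;

  for (i = 0; i < n; i++)
    c[i] = a[i] + b[i];
}

/* Hypothetical helper: both shapes must store the same values into c.  */
void
check_equivalent (unsigned int n)
{
  unsigned int expected[N];
  unsigned int i;

  for (i = 0; i < N; i++)
    {
      a[i] = i;
      b[i] = 2 * i;
    }

  memset (c, 0, sizeof c);
  sum_peeled (n);
  memcpy (expected, c, sizeof c);

  memset (c, 0, sizeof c);
  sum_full (n);
  assert (memcmp (expected, c, sizeof c) == 0);
}
```

Both variants compute the same result; the second simply leaves no sequential
tail after the parallel region.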
