[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644

rguenth at gcc dot gnu.org Wed, 07 Feb 2018 07:47:03 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037


--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
So after r257453 we improve the situation pre-IVOPTs to just
6 IVs (duplicated but trivially equivalent) plus one counting IV.  But then,
when SLP is enabled, IVOPTs comes along and adds another 4 IVs, which makes us
spill... (this is for AVX256, so you need -march=core-avx2 for example).

Bin, any chance you can take a look?  In the IVO dump I see

  target_avail_regs 15
  target_clobbered_regs 9
  target_reg_cost 4
  target_spill_cost 8
  regs_used 3
^^^

and regs_used looks awfully low to me.  The loop has even more IVs initially,
plus variable steps for those IVs, which means we need two regs per IV.
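
To illustrate the two-regs-per-IV point, here is a minimal sketch.  It is a
hypothetical loop, not the benchmark kernel: the function name strided_dot3
and its parameters are made up for illustration.  Each pointer is an IV whose
step is only known at run time, so both the pointer and its step stay live
across the loop, on top of the counting IV.

```c
#include <stddef.h>

/* Hypothetical example (not from the polyhedron benchmark): three
   streams, each advanced by a stride that is a run-time value.  */
double
strided_dot3 (const double *a, ptrdiff_t sa,
              const double *b, ptrdiff_t sb,
              const double *c, ptrdiff_t sc,
              size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)   /* counting IV */
    {
      sum += *a * *b * *c;
      a += sa;   /* pointer IV; step sa must stay live as well */
      b += sb;   /* pointer IV; step sb must stay live as well */
      c += sc;   /* pointer IV; step sc must stay live as well */
    }
  return sum;
}
```

With three such IVs that is already six values live across the loop before the
counter and the accumulator are considered, which is why a regs_used of 3 would
look too low for a loop of this shape.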

There doesn't seem to be a way to force IVOPTs to use the minimal set of IVs?
Or to just use the original set, removing the obvious redundancies?  There is
a microarchitectural issue left with the vectorization, but the spilling
obscures the picture quite a bit :/
