https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88494
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
On Skylake it's better (1 uop, 1 cycle latency) while on Ryzen even better. On
Bulldozer it also isn't that bad (comparable to Skylake I guess). So for
generic tuning leave it enabled but for Broadwell and earlier tuning disable
it?

Note I didn't actually try benchmarking on Skylake, Ryzen or Bulldozer.