https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614
--- Comment #8 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I have tried the attached change on our periodic tester for Haswell. It
switches codegen to something similar to what we use for PentiumPro (assuming
that renaming happens on register parts as opposed to full registers).
The relevant run is Oct 6, 2017 20:00 UTC of Czerny at
https://gcc.opensuse.org/gcc-old/SPEC/CFP/sb-czerny-head-64-2006/recent.html
and
https://gcc.opensuse.org/gcc-old/SPEC/CINT/sb-czerny-head-64-2006/recent.html

It seems SPEC neutral. Because it models more closely what happens, perhaps
changing it makes sense?

Index: x86-tune.def
===================================================================
--- x86-tune.def (revision 253509)
+++ x86-tune.def (working copy)
@@ -48,7 +48,7 @@
    over partial stores. For example preffer MOVZBL or MOVQ to load 8bit
    value over movb. */
 DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
-          m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT | m_INTEL
+          m_P4_NOCONA | m_BONNELL | m_SILVERMONT
           | m_KNL | m_KNM | m_AMD_MULTIPLE | m_GENERIC)
 
 /* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: This knob promotes all store
@@ -467,20 +467,20 @@
    In current implementation the partial register stalls are not eliminated
    very well - they can be introduced via subregs synthesized by combine
    and can happen in caller/callee saving sequences. */
-DEF_TUNE (X86_TUNE_PARTIAL_REG_STALL, "partial_reg_stall", m_PPRO)
+DEF_TUNE (X86_TUNE_PARTIAL_REG_STALL, "partial_reg_stall", m_PPRO | m_CORE_ALL | m_INTEL)
 
 /* X86_TUNE_PROMOTE_QIMODE: When it is cheap, turn 8bit arithmetic to
    corresponding 32bit arithmetic. */
 DEF_TUNE (X86_TUNE_PROMOTE_QIMODE, "promote_qimode",
-          ~m_PPRO)
+          ~(m_PPRO | m_CORE_ALL | m_INTEL))
 
 /* X86_TUNE_PROMOTE_HI_REGS: Same, but for 16bit artihmetic. Again we avoid
    partial register stalls on PentiumPro targets. */
-DEF_TUNE (X86_TUNE_PROMOTE_HI_REGS, "promote_hi_regs", m_PPRO)
+DEF_TUNE (X86_TUNE_PROMOTE_HI_REGS, "promote_hi_regs", m_PPRO | m_CORE_ALL | m_INTEL)
 
 /* X86_TUNE_HIMODE_MATH: Enable use of 16bit arithmetic.
    On PPro this flag is meant to avoid partial register stalls. */
-DEF_TUNE (X86_TUNE_HIMODE_MATH, "himode_math", ~m_PPRO)
+DEF_TUNE (X86_TUNE_HIMODE_MATH, "himode_math", ~(m_PPRO | m_CORE_ALL | m_INTEL))
 
 /* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
    directly to memory. */