https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115845
Bug ID: 115845
Summary: 25% runtime regression of 527.cam4_r when enabling --param vect-partial-vector-usage={1,2} on top of -Ofast -march=znver4
Product: gcc
Version: 14.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

There are a lot of cases like the one below:

       │7c40:┌─ lea          (%rdi,%rax,1),%rbx
       │     │tau_w_f(1:ncol,1:pver,:) = tau_w_f(1:ncol,1:pver,:) + twf(1:ncol,:,:)
    68 │     │  vmovupd      (%rsi,%rax,1),%zmm14{%k1}
       │     │  add          $0x40,%rax
       │     │  vmovupd      (%rbx),%zmm17{%k1}
 11165 │     │  vaddpd       %zmm17,%zmm14,%zmm20
   864 │     │  vmovupd      %zmm20,(%rbx){%k1}
       │     │  mov          %r11d,%ebx
       │     │  vpbroadcastw %r11d,%xmm20
     5 │     │  sub          $0x8,%r11d
       │     │  add          $0x8,%ebx
       │     │  vpcmpnleuw   %xmm1,%xmm20,%k1
     1 │     │  cmp          $0x8,%bx
    89 │     └──ja           7c40

resulting in

Samples: 1M of event 'cycles:u', Event count (approx.): 1356741812802 2
Overhead  Samples  Command          Shared Object         Symbol
   7.02%    79632  cam4_r_peak.gcc  cam4_r_peak.gcc7-m64  [.] __aer_rad_props_MOD_aer_rad_props_sw
   3.96%    43265  cam4_r_peak.gcc  cam4_r_peak.gcc7-m64  [.] __tracer_data_MOD_interpolate_trcdata.constprop.0
   3.09%    34998  cam4_r_peak.gcc  cam4_r_peak.gcc7-m64  [.] __radsw_MOD_radcswmx
   2.94%    32597  cam4_r_base.gcc  libm-2.31.so          [.] __ieee754_log_fma
   2.70%    30823  cam4_r_peak.gcc  libm-2.31.so          [.] __ieee754_log_fma
   2.68%    29978  cam4_r_base.gcc  cam4_r_base.gcc7-m64  [.] __radsw_MOD_radcswmx
   2.28%    24998  cam4_r_base.gcc  cam4_r_base.gcc7-m64  [.] __tracer_data_MOD_interpolate_trcdata.constprop.0
   2.21%    25215  cam4_r_peak.gcc  cam4_r_peak.gcc7-m64  [.] __radae_MOD_radabs
   2.07%    22878  cam4_r_base.gcc  cam4_r_base.gcc7-m64  [.] __radae_MOD_radabs
   1.77%    20098  cam4_r_peak.gcc  cam4_r_peak.gcc7-m64  [.] __zm_conv_MOD_ientropy.isra.0
   1.62%    18145  cam4_r_base.gcc  cam4_r_base.gcc7-m64  [.] __aer_rad_props_MOD_aer_rad_props_sw

(The topmost and bottommost entries are the same function, peak vs. base.)
It almost feels like fault suppression kicking in, but on loads?!
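
For readers not used to reading fully masked loops, here is a scalar C model of what the annotated loop appears to compute. It is only a sketch. The name `masked_add` is hypothetical, and reading `%xmm1` as the lane indices 0..7 (so that lane i is active when more than i elements remain) is an assumption from the disassembly, not something confirmed from the GCC dumps. It models the semantics only and says nothing about why the masked form is slow.

```c
#include <stddef.h>

#define LANES 8 /* eight doubles per 512-bit vector */

/* Hypothetical scalar model: dst[i] += src[i] for i in [0, n).
   Elements are processed in blocks of LANES under a per-lane predicate,
   so the last partial block needs no separate scalar epilogue. */
static void masked_add(double *dst, const double *src, size_t n)
{
    for (size_t base = 0; base < n; base += LANES) {
        size_t remaining = n - base;
        for (size_t lane = 0; lane < LANES; ++lane) {
            /* Models one bit of the %k1 mask: lane is active
               while its index is below the remaining count. */
            int active = lane < remaining;
            if (active)
                dst[base + lane] += src[base + lane];
        }
    }
}
```

With n = 11, the first block runs with all eight lanes active and the second with only three, which corresponds to the partial-vector case that --param vect-partial-vector-usage enables.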