https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-08-09
          Component|tree-optimization           |target
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  The issue is that the target builds a new(!) VAR_DECL for
__cpu_model for _each_(!) invocation of __builtin_cpu_supports, so the
GIMPLE-level optimizers have no chance to CSE the loads.

The two VAR_DECLs for this testcase are

 <var_decl 0x7ffff7fefb40 __cpu_model
    type <record_type 0x7ffff69c8888 __processor_model>
    public static external weak preserve BLK
    (null):0:0
    size <integer_cst 0x7ffff69ade10 constant 160>
    unit-size <integer_cst 0x7ffff69ade40 constant 20>
    align:32 warn_if_not_align:0>

 <var_decl 0x7ffff7fefab0 __cpu_model
    type <record_type 0x7ffff69c8498 __processor_model>
    public static external weak preserve BLK
    (null):0:0
    size <integer_cst 0x7ffff69ade10 constant 160>
    unit-size <integer_cst 0x7ffff69ade40 constant 20>
    align:32 warn_if_not_align:0>

so you can see that we even re-build the RECORD_TYPE from scratch... :/