https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117253
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
/* A C expression for the cost of a branch instruction.  A value of 1
   is the default; other values are interpreted relative to that.  */

#define BRANCH_COST(speed_p, predictable_p) \
  (!(speed_p) ? 2 : (predictable_p) ? 0 : ix86_branch_cost)

So aarch64 uses 3 when optimizing for size.
I wonder if 3 would be better when !speed for x86 too.
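
For illustration only, here is a minimal standalone sketch of how the quoted macro evaluates, next to a variant that returns 3 when !speed. It is not GCC source: BRANCH_COST_PROPOSED and print_costs are hypothetical names, and ix86_branch_cost is a plain variable set to an assumed placeholder value rather than the real tuning-dependent cost.

```c
#include <stdio.h>

/* Stand-in for the tuning-dependent value.  The value 3 is an assumed
   placeholder, not taken from any real x86 cost table.  */
int ix86_branch_cost = 3;

/* x86 definition as quoted in the comment.  */
#define BRANCH_COST(speed_p, predictable_p) \
  (!(speed_p) ? 2 : (predictable_p) ? 0 : ix86_branch_cost)

/* Hypothetical variant: same shape, but 3 when optimizing for size.  */
#define BRANCH_COST_PROPOSED(speed_p, predictable_p) \
  (!(speed_p) ? 3 : (predictable_p) ? 0 : ix86_branch_cost)

/* Print both macros for every combination of the two flags.  */
void print_costs(void)
{
  for (int speed_p = 0; speed_p <= 1; speed_p++)
    for (int predictable_p = 0; predictable_p <= 1; predictable_p++)
      printf("speed_p=%d predictable_p=%d current=%d proposed=%d\n",
             speed_p, predictable_p,
             BRANCH_COST(speed_p, predictable_p),
             BRANCH_COST_PROPOSED(speed_p, predictable_p));
}
```

Calling print_costs() from a small main shows that the two variants differ only in the !speed rows, where predictable_p is ignored.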