https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359
--- Comment #32 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
As mentioned in the previous comment, the proposed patch brings down the count
from 116 to 108 on ARM, but is shy of the desired 96.

The missing bytes can be attributed to forwprop folding this (IL expanded for
illustration):

  if (ui_7 / 10 != 0)

into:

  if (ui_7 > 9)

More specifically, changing this:

  # ui_7 = PHI <ui_13(2), ui_21(3)>
  ...
  ui_21 = ui_7 / 10;
  if (ui_21 != 0)

into:

  # ui_7 = PHI <ui_13(2), ui_21(3)>
  ...
  ui_21 = ui_7 / 10;
  if (ui_7 > 9)

Inhibiting this optimization brings down the byte count to 92, which is even
lower than our 96 bogeyman, so perhaps worth pursuing.  (Assumes my proposed
patch is also applied.)

I'm no expert, but isn't an EQ/NE with 0 preferable to a <> with a non-zero?
If so, should we restrict the folding somewhat, or clean this up after the
fact?

For reference, the folding (in forwprop) is due to this match.pd pattern:

  /* X / C1 op C2 into a simple range test.  */

...though eliminating it causes another pattern to pick up the slack and do
the same:

  /* Transform:
   * (X / Y) == 0 -> X < Y if X, Y are unsigned.
   * (X / Y) != 0 -> X >= Y, if X, Y are unsigned.
   */

Eliminating both patterns "fixes" the problem.

Suggestions welcome :).
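
To make the two forms concrete, here is a minimal C sketch of the condition before and after the folding. The function names and the digit-counting loop are hypothetical, chosen only to mirror the shape of the IL above; they are not taken from the PR's test case. The two predicates are equivalent for unsigned operands, so the fold itself is correct. The open question is only which form generates smaller code once the quotient has to be computed anyway. Presumably the unfolded form lets the branch test the quotient that is already live, while the folded form keeps the pre-division value around for a compare against 9.

```c
#include <assert.h>
#include <stdbool.h>

/* Unfolded form: the branch tests the quotient against zero. */
bool more_digits_div(unsigned int ui)
{
    return ui / 10 != 0;
}

/* Folded form: a range test on the value before the division. */
bool more_digits_cmp(unsigned int ui)
{
    return ui > 9;
}

/* Hypothetical loop with the same shape as the IL: the quotient feeds
   the next iteration and also decides whether the loop continues. */
unsigned int count_digits(unsigned int ui)
{
    unsigned int n = 0;
    do {
        n++;
        ui = ui / 10;       /* ui_21 = ui_7 / 10 */
    } while (ui != 0);      /* if (ui_21 != 0)   */
    return n;
}

/* Assert that both predicates agree for every value below LIMIT. */
void check_equivalence(unsigned int limit)
{
    for (unsigned int ui = 0; ui < limit; ui++)
        assert(more_digits_div(ui) == more_digits_cmp(ui));
}
```

Compiling both predicates, and the loop with each form of the exit test, at -Os for the target in question would be one way to compare the resulting byte counts.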