https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65371
--- Comment #2 from Stuart <gcc-bugzilla at enginuities dot com> ---
I compiled it for x86_64 and thought it was fine; however, after your comment I tried compiling it with clang/llvm and can see the difference (I'm not particularly familiar with the full instruction set)...

I've found another case which could also be improved:

func.c:

    #include <stdint.h>

    #define PERIPH_BASE 0x40000000
    #define PERIPH ((PERIPH_TypeDef *) PERIPH_BASE)

    typedef struct
    {
      volatile uint32_t REG1;
    } PERIPH_TypeDef;

    void func(uint16_t a)
    {
      uint32_t t = PERIPH->REG1;

      while ((uint16_t) (PERIPH->REG1 - t) < a) { }
    }

gives:

    00000000 <func>:
       0:   f04f 4380   mov.w   r3, #1073741824 ; 0x40000000
       4:   461a        mov     r2, r3
       6:   6819        ldr     r1, [r3, #0]
       8:   6813        ldr     r3, [r2, #0]
       a:   1a5b        subs    r3, r3, r1
       c:   b29b        uxth    r3, r3
       e:   4283        cmp     r3, r0
      10:   d3fa        bcc.n   8 <func+0x8>
      12:   4770        bx      lr

For some reason r3 is moved into r2, and then the value at the address in r2 is loaded into r3 inside the loop! I would expect the following:

    00000000 <func>:
       0:   f04f 4180   mov.w   r1, #1073741824 ; 0x40000000
       4:   680a        ldr     r2, [r1, #0]
       6:   680b        ldr     r3, [r1, #0]
       8:   1a9b        subs    r3, r3, r2
       a:   b29b        uxth    r3, r3
       c:   4283        cmp     r3, r0
       e:   d3fa        bcc.n   6 <func+0x6>
      10:   4770        bx      lr
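For anyone who wants to check that the two instruction sequences above compute the same thing, here is a rough host-side sketch of what func() does. It is only a model under stated assumptions: fake_reg1, still_waiting and sim_func are hypothetical names that are not part of the test case, and an ordinary volatile variable stands in for the register at 0x40000000. The point is that each loop iteration needs exactly one volatile load, one subtraction, a truncation to 16 bits (the uxth) and an unsigned compare, so the base address only needs to live in one register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PERIPH->REG1: a host has nothing mapped at
   0x40000000, so a plain volatile variable is used instead. */
volatile uint32_t fake_reg1;

/* Loop condition from func(): only the low 16 bits of the difference are
   compared, which is what the uxth instruction does. */
bool still_waiting(uint32_t start, uint32_t now, uint16_t a)
{
    return (uint16_t)(now - start) < a;
}

/* Host-side model of func(). Returns the number of loads performed
   inside the loop. */
unsigned sim_func(uint16_t a)
{
    uint32_t t = fake_reg1;          /* first load, outside the loop */
    unsigned reads = 0;

    for (;;) {
        uint32_t now = fake_reg1;    /* one load per iteration */
        reads++;
        if (!still_waiting(t, now, a))
            break;
        fake_reg1 = now + 1;         /* assumption: counter ticks once per poll */
    }
    return reads;
}
```

Because the difference is truncated to 16 bits, the comparison still works when the 32-bit register wraps past zero, but a difference of exactly 0x10000 is indistinguishable from 0. Calling sim_func(a) after setting fake_reg1 to any starting value exercises the same sequence of operations as the loop in either listing.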