https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108712
Bug ID: 108712
Summary: Missing optimization with memory-barrier
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: klaus.doldinger64 at googlemail dot com
Target Milestone: ---

In the following example, the increments of `g` could be optimized to the
equivalent of `g += 20`. However, avr-gcc hoists the load of `g` out of the
loop while the stores remain inside the loop, which produces unnecessary
overhead.

(I know that the idiomatic solution would be to volatile-qualify the variable
`flag`, or to make a volatile access like `std::experimental::volatile_load()`
or `ACCESS_ONCE()` (Linux kernel).)

https://godbolt.org/z/1b6xG5YP4

----
#include <stdint.h>
#include <util/atomic.h>
//#include <signal.h> // do not include that stub: wrong sig_atomic_t
typedef signed char sig_atomic_t; // __SIG_ATOMIC_TYPE__
#include <avr/interrupt.h>

static sig_atomic_t flag;
static uint8_t g;

void func(void) {
    for(uint8_t i = 0; i < 20; i++) {
        __asm__ __volatile__ ("" : "+m" (flag));
        ++g;
        if (flag) {
            flag = 0;
        }
        __asm__ __volatile__ ("" : "+m" (flag));
    }
}

ISR(USART_RXC_vect) {
    __asm__ __volatile__ ("" : "+m" (flag));
    flag = 1;
}
----
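
For comparison, the volatile-access alternative mentioned in the description might look roughly like the sketch below. It is only an illustration, not part of the original test case: the `ACCESS_ONCE()` macro here is a local definition modeled on the Linux kernel idiom (it relies on the GNU `__typeof__` extension), `flag_t` is a hypothetical stand-in for the `sig_atomic_t` typedef above, and `fake_isr()` is a hypothetical replacement for the ISR so that the code builds on a host without the AVR headers. With this form only `flag` is accessed through a volatile lvalue, so the compiler is permitted, though not required, to combine the increments of `g`.

```c
#include <stdint.h>

/* Local definition modeled on the Linux kernel's ACCESS_ONCE() idiom
   (an assumption of this sketch): access x through a volatile lvalue
   so the load or store cannot be elided or merged. Needs __typeof__. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the report's sig_atomic_t typedef. */
typedef signed char flag_t;

static flag_t flag;   /* set asynchronously, e.g. from an ISR */
static uint8_t g;

void func(void) {
    for (uint8_t i = 0; i < 20; i++) {
        ++g;                        /* plain access to g */
        if (ACCESS_ONCE(flag)) {    /* volatile read in every iteration */
            ACCESS_ONCE(flag) = 0;  /* volatile write */
        }
    }
}

/* Hypothetical host-side replacement for ISR(USART_RXC_vect). */
void fake_isr(void) {
    ACCESS_ONCE(flag) = 1;
}
```

Whether a given GCC version actually emits a single `g += 20` for this variant would have to be checked in the generated assembly.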