https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108712
Bug ID: 108712
Summary: Missing optimization with memory-barrier
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: klaus.doldinger64 at googlemail dot com
Target Milestone: ---
In the following example, the increments of `g` could be combined into the
equivalent of a single `g += 20`.
Instead, avr-gcc hoists the load of `g` out of the loop but leaves the stores
inside the loop.
That produces unnecessary overhead.
(I know that the idiomatic solution would be to volatile-qualify the variable
`flag` or to make a volatile access such as `std::experimental::volatile_load()` or
`ACCESS_ONCE()` (Linux kernel).)
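For comparison, a minimal sketch of the volatile-access alternative mentioned above. The `ACCESS_ONCE` macro here is defined locally and only modelled on the Linux kernel macro of the same name; `func_volatile` is a hypothetical name, `signed char` stands in for `sig_atomic_t`, and `__typeof__` is a GNU extension. It only shows the shape of the source and makes no claim about what avr-gcc generates for it.

```c
#include <stdint.h>

/* Local stand-in modelled on the Linux kernel's ACCESS_ONCE();
   relies on the GNU __typeof__ extension. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static signed char flag;   /* stands in for sig_atomic_t */
static uint8_t g;

/* Hypothetical variant of func(): every access to `flag` is a volatile
   access, so no inline-asm barrier is needed. */
void func_volatile(void) {
    for (uint8_t i = 0; i < 20; i++) {
        ++g;
        if (ACCESS_ONCE(flag)) {
            ACCESS_ONCE(flag) = 0;
        }
    }
}
```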
https://godbolt.org/z/1b6xG5YP4
----
#include <stdint.h>
#include <util/atomic.h>
//#include <signal.h> // do not include that stub: wrong sig_atomic_t
typedef signed char sig_atomic_t; // __SIG_ATOMIC_TYPE__
#include <avr/interrupt.h>
static sig_atomic_t flag;
static uint8_t g;
void func(void) {
    for(uint8_t i = 0; i < 20; i++) {
        __asm__ __volatile__ ("" : "+m" (flag));
        ++g;
        if (flag) {
            flag = 0;
        }
        __asm__ __volatile__ ("" : "+m" (flag));
    }
}
ISR(USART_RXC_vect) {
    __asm__ __volatile__ ("" : "+m" (flag));
    flag = 1;
}
----
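To make the expected result concrete, here is a hand-written sketch of the transformation I would expect. It assumes that the barriers, which name only `flag` as an operand, place no constraint on `g`. `func_expected` is a hypothetical name, `signed char` stands in for `sig_atomic_t`, and the ISR and AVR headers are left out so the sketch also builds with a host GCC.

```c
#include <stdint.h>

static signed char flag;   /* stands in for sig_atomic_t */
static uint8_t g;

/* The loop as written in the report (ISR omitted). */
void func(void) {
    for (uint8_t i = 0; i < 20; i++) {
        __asm__ __volatile__ ("" : "+m" (flag));
        ++g;
        if (flag) {
            flag = 0;
        }
        __asm__ __volatile__ ("" : "+m" (flag));
    }
}

/* Hypothetical hand-optimized form: `flag` is still re-read around
   every barrier, but `g` is loaded and stored once, after the loop. */
void func_expected(void) {
    for (uint8_t i = 0; i < 20; i++) {
        __asm__ __volatile__ ("" : "+m" (flag));
        if (flag) {
            flag = 0;
        }
        __asm__ __volatile__ ("" : "+m" (flag));
    }
    g += 20;
}
```

When no interrupt modifies `flag` during the call, both functions leave `g` and `flag` in the same state; the difference is one store to `g` instead of twenty.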