https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54816
Wilhelm M <klaus.doldinger64 at googlemail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |klaus.doldinger64@googlemai
                   |                            |l.com

--- Comment #1 from Wilhelm M <klaus.doldinger64 at googlemail dot com> ---
The following code has the same problem:

#include <avr/io.h>
#include <stdint.h>

uint16_t b;
uint8_t a;

template<typename A, typename B>
B Mul(const A a, const B b) {
    static constexpr uint8_t shift = (sizeof(B) - sizeof(A)) * 8;
    return static_cast<A>(b >> shift) * a ;
}

int main() {
    return Mul(a, b);
}

With 4.6.4 it produces:

main:
    lds r24,a
    lds r25,b+1
    mul r25,r24
    movw r24,r0
    clr r1
    ret

With the current 12.2 the optimization is missed, and it produces:

main:
    lds r24,b+1
    ldi r25,0
    lds r18,a
    movw r20,r24
    mul r18,r20
    movw r24,r0
    mul r18,r21
    add r25,r0
    clr __zero_reg__
    ret

Interestingly, the following code produces optimal code with 12.2 as well:

template<typename A, typename B>
B MulX(const A a, const B b) {
    static const uint8_t shift = (sizeof(B) - sizeof(A)) * 8;
    return static_cast<A>((b >> shift) + 1) * a ;
}
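
For readers without an AVR toolchain, here is a minimal host-side sketch of what Mul computes for A = uint8_t and B = uint16_t: the high byte of b times a, which is an 8x8 -> 16 bit product and therefore fits a single AVR mul, as in the 4.6.4 output above. The helper names mul_high_byte and self_check are hypothetical and not part of the report. The sketch assumes a host where int is wider than 16 bits, and <avr/io.h> is left out so that it builds with any C++11 compiler.

```cpp
#include <cassert>
#include <cstdint>

// Same template as in the report, minus the AVR header.
template<typename A, typename B>
B Mul(const A a, const B b) {
    static constexpr std::uint8_t shift = (sizeof(B) - sizeof(A)) * 8;
    return static_cast<A>(b >> shift) * a;
}

// Hypothetical reference: explicit 8x8 -> 16 bit multiply of b's high byte.
std::uint16_t mul_high_byte(std::uint8_t a, std::uint16_t b) {
    const std::uint8_t high = static_cast<std::uint8_t>(b >> 8);
    return static_cast<std::uint16_t>(static_cast<unsigned>(high) * a);
}

// Compare the template against the reference for every input pair.
void self_check() {
    for (unsigned a = 0; a <= 0xFFu; ++a) {
        for (unsigned b = 0; b <= 0xFFFFu; ++b) {
            const std::uint8_t a8 = static_cast<std::uint8_t>(a);
            const std::uint16_t b16 = static_cast<std::uint16_t>(b);
            assert(Mul(a8, b16) == mul_high_byte(a8, b16));
        }
    }
}
```

Calling self_check() from main on a host build exercises all 256 * 65536 input pairs. It only confirms the arithmetic and says nothing about the code avr-gcc generates.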