https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538
--- Comment #16 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Christophe Lyon from comment #15)
> (In reply to Wilco from comment #14)
> > (In reply to Christophe Lyon from comment #11)
> > > (In reply to Wilco from comment #10)
> >
> > > Right, but the code is functional.
> >
> > It doesn't avoid the literal load from flash which is exactly what pure-code
> > and slow-flash-data is all about.
>
> For f1 on M0, I can see:
>         .section        .rodata.cst4,"aM",%progbits,4
>         .align  2
> .LC0:
>         .word   .LANCHOR0
>         .section .text,"0x20000006",%progbits
> [...]
> f1:
>         movs    r3, #:upper8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:upper0_7:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower0_7:#.LC0
>         ldr     r3, [r3]        @ 6     [c=10 l=2]  *thumb1_movsi_insn/8
>         ldr     r0, [r3]        @ 7     [c=10 l=2]  *thumb1_movsi_insn/8
>         bx      lr
> [...]
>         .bss
>         .align  2
>         .set    .LANCHOR0,. + 0
>         .type   x, %object
>         .size   x, 4
> x:
>         .space  4
>
> So the 1st load is from .rodata.cst4 and the 2nd load is from bss, both of
> which do not have the purecode bit set (unlike .text). Isn't that OK?

No, it will create a lot of complaints and support queries due to the obvious
regressions. It goes against the definition of pure-code and slow-flash-data
which is to remove the literal loads. And given the sequence is already
inefficient, we should do everything to remove the indirection which increases
the codesize overhead by 75%...

Another aspect that needs to be checked is that GCC correctly spills addresses
and complex constants instead of rematerializing them. This is basic minimal
quality that one expects for a feature like this.
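
For readers less familiar with the quoted sequence: the movs/lsls/adds
instructions build a 32-bit address one byte at a time, most significant byte
first. Below is a minimal C sketch of that arithmetic only, assuming
:upper8_15:, :upper0_7:, :lower8_15: and :lower0_7: select bits 31-24, 23-16,
15-8 and 7-0 of the symbol address respectively. The function name
build_address is hypothetical and is not part of GCC or of the test case.

```c
#include <stdint.h>

/* Illustrative model of the Thumb-1 movs/lsls/adds sequence quoted above.
   build_address is a hypothetical helper, not GCC code. */
static uint32_t build_address(uint32_t addr)
{
    uint32_t r3;

    r3 = (addr >> 24) & 0xffu;   /* movs r3, #:upper8_15:sym */
    r3 <<= 8;                    /* lsls r3, #8              */
    r3 += (addr >> 16) & 0xffu;  /* adds r3, #:upper0_7:sym  */
    r3 <<= 8;                    /* lsls r3, #8              */
    r3 += (addr >> 8) & 0xffu;   /* adds r3, #:lower8_15:sym */
    r3 <<= 8;                    /* lsls r3, #8              */
    r3 += addr & 0xffu;          /* adds r3, #:lower0_7:sym  */

    return r3;                   /* r3 now holds the full address */
}
```

In the quoted f1, the address built this way is that of .LC0, so two loads
follow; building the address of x directly would leave only the final load.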