https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #16 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Christophe Lyon from comment #15)
> (In reply to Wilco from comment #14)
> > (In reply to Christophe Lyon from comment #11)
> > > (In reply to Wilco from comment #10)
> > 
> > > Right, but the code is functional.
> > 
> > It doesn't avoid the literal load from flash which is exactly what pure-code
> > and slow-flash-data is all about.
> 
> For f1 on M0, I can see:
>         .section        .rodata.cst4,"aM",%progbits,4
>         .align  2
> .LC0:
>         .word   .LANCHOR0
>         .section .text,"0x20000006",%progbits
> [...]
> f1:
>         movs    r3, #:upper8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:upper0_7:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower0_7:#.LC0
>         ldr     r3, [r3]        @ 6     [c=10 l=2]  *thumb1_movsi_insn/8
>         ldr     r0, [r3]        @ 7     [c=10 l=2]  *thumb1_movsi_insn/8
>         bx      lr
> [...]
>         .bss
>         .align  2
>         .set    .LANCHOR0,. + 0
>         .type   x, %object
>         .size   x, 4
> x:
>         .space  4
> 
> So the 1st load is from .rodata.cst4 and the 2nd load is from bss, both of
> which do not have the purecode bit set (unlike .text). Isn't that OK?

No, it will create a lot of complaints and support queries due to the obvious
regressions. It goes against the definition of pure-code and slow-flash-data,
which is to remove the literal loads. And given the sequence is already
inefficient, we should do everything we can to remove the indirection, which
increases the codesize overhead by 75%...
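
To make the indirection concrete, here is a minimal C model of the two
shapes of f1. It is only a sketch under stated assumptions: the toy memory,
the addresses X_ADDR and LC0_ADDR, the load counter and all function names
are hypothetical and not taken from GCC. build_const() mimics the
movs/lsls/adds chain quoted above; f1_indirect() targets the literal .LC0
and needs two loads, while f1_direct() targets x itself and needs one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flat toy memory; real hardware would place .LC0 in flash
   and x in RAM. Addresses are byte addresses starting at BASE. */
#define BASE     0x20000000u
#define X_ADDR   (BASE + 0x10u)   /* assumed address of x */
#define LC0_ADDR (BASE + 0x20u)   /* assumed address of literal .LC0 */

static uint32_t mem[16];
static int loads;                 /* number of simulated ldr instructions */

/* Simulated word load; counts every access. */
static uint32_t ldr(uint32_t addr)
{
    loads++;
    return mem[(addr - BASE) / 4];
}

/* Models the 7-instruction chain that builds a 32-bit value one byte at
   a time, as the :upper8_15: ... :lower0_7: relocations do. */
static uint32_t build_const(uint32_t value)
{
    uint32_t r = (value >> 24) & 0xffu;   /* movs r3, #:upper8_15: */
    r <<= 8; r += (value >> 16) & 0xffu;  /* lsls; adds #:upper0_7: */
    r <<= 8; r += (value >> 8) & 0xffu;   /* lsls; adds #:lower8_15: */
    r <<= 8; r += value & 0xffu;          /* lsls; adds #:lower0_7: */
    return r;
}

/* Store x = 42 and let the literal .LC0 hold the address of x. */
static void setup_memory(void)
{
    assert(build_const(X_ADDR) == X_ADDR);
    mem[(X_ADDR - BASE) / 4] = 42;
    mem[(LC0_ADDR - BASE) / 4] = X_ADDR;
}

/* Shape quoted above: build &.LC0, load &x from it, then load x. */
static uint32_t f1_indirect(void)
{
    uint32_t r3 = build_const(LC0_ADDR);
    r3 = ldr(r3);
    return ldr(r3);
}

/* Shape without the indirection: build &x directly, then load x. */
static uint32_t f1_direct(void)
{
    uint32_t r3 = build_const(X_ADDR);
    return ldr(r3);
}
```

After setup_memory(), both functions return the same value, but the direct
form issues one load instead of two and needs no literal word at all.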

Another aspect that needs to be checked is that GCC correctly spills addresses
and complex constants instead of rematerializing them. This is the basic
minimum quality that one expects for a feature like this.
