https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538
--- Comment #11 from Christophe Lyon <clyon at gcc dot gnu.org> ---
(In reply to Wilco from comment #10)
>
> For example:
>
> int x;
> int f1 (void) { return x; }
>
> with eg. -O2 -mcpu=cortex-m0 -mpure-code I get:
>
> movs r3, #:upper8_15:#.LC1
> lsls r3, #8
> adds r3, #:upper0_7:#.LC1
> lsls r3, #8
> adds r3, #:lower8_15:#.LC1
> lsls r3, #8
> adds r3, #:lower0_7:#.LC1
> @ sp needed
> ldr r3, [r3]
> ldr r0, [r3, #40]
> bx lr
>
> That's an extra indirection through a literal... There should only be
> one ldr to read x.
Right, but the code is functional. I mentioned that problem when I submitted
the patch: I thought it was better to provide working functionality first and
improve the generated code later.
I wrote: "I haven't found yet how to make code for cortex-m0 apply upper/lower
relocations to "p" instead of .LC2. The current code looks functional, but
could be improved."
>
> Big switch tables are produced for any Thumb-1 core, however I would expect
> Cortex-m0/m23 versions to look almost identical to the Cortex-m3 one, and
> use a sequence of comparisons instead of tables.
>
> int f2 (int x, int y)
> {
> switch (x)
> {
> case 0: return y + 0;
> case 1: return y + 1;
> case 2: return y + 2;
> case 3: return y + 3;
> case 4: return y + 4;
> case 5: return y + 5;
> }
> return y;
> }
>
I believe this is expected; as I wrote in my commit message:
"CASE_VECTOR_PC_RELATIVE is now false with -mpure-code, to avoid generating
invalid assembly code with differences from symbols from two different sections
(the difference cannot be computed by the assembler)."
Maybe there's a way to tune this to detect the cases where we can do better?
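To illustrate what a sequence-of-comparisons lowering could look like, here is
a hand-written sketch for the first cases of f2 (not actual GCC output for any
of these cores); it avoids the cross-section symbol differences entirely:

    cmp     r0, #0          @ x in r0, y in r1 (AAPCS)
    beq     .Lcase0
    cmp     r0, #1
    beq     .Lcase1
    cmp     r0, #2
    beq     .Lcase2
    @ ... remaining cases handled the same way ...
.Ldefault:
    movs    r0, r1          @ return y
    bx      lr
.Lcase0:
    movs    r0, r1          @ return y + 0
    bx      lr
.Lcase1:
    adds    r0, r1, #1      @ return y + 1
    bx      lr
.Lcase2:
    adds    r0, r1, #2      @ return y + 2
    bx      lr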
> Immediate generation for common cases seems to be screwed up:
>
> int f3 (void) { return 0x11000000; }
>
> -O2 -mcpu=cortex-m0 -mpure-code:
>
> movs r0, #17
> lsls r0, r0, #8
> lsls r0, r0, #8
> lsls r0, r0, #8
> bx lr
This is not optimal, but functional, right?
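(For what it's worth, the two-instruction form you quote below for cortex-m23
is also valid Thumb-1, so cortex-m0 should in principle be able to use it too;
a sketch, assuming nothing else gets in the way:)

    movs    r0, #136        @ 0x88
    lsls    r0, r0, #21     @ 0x88 << 21 == 0x11000000
    bx      lr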
> This also regressed Cortex-m23 which previously generated:
>
> movs r0, #136
> lsls r0, r0, #21
> bx lr
> Similar regressions happen with other immediates:
>
> int f3 (void) { return 0x12345678; }
>
> -O2 -mcpu=cortex-m23 -mpure-code:
>
> movs r0, #86
> lsls r0, r0, #8
> adds r0, r0, #120
> movt r0, 4660
> bx lr
>
> Previously it was:
>
> movw r0, #22136
> movt r0, 4660
> bx lr
OK, I'll check how to fix that.
> Also relocations with a small offset should be handled within the
> relocation. I'd expect this to never generate an extra addition, let alone
> an extra literal pool entry:
>
> int arr[10];
> int *f4 (void) { return &arr[1]; }
>
> -O2 -mcpu=cortex-m3 -mpure-code generates the expected:
>
> movw r0, #:lower16:.LANCHOR0+4
> movt r0, #:upper16:.LANCHOR0+4
> bx lr
>
> -O2 -mcpu=cortex-m23 -mpure-code generates this:
>
> movw r0, #:lower16:.LANCHOR0
> movt r0, #:upper16:.LANCHOR0
> adds r0, r0, #4
> bx lr
For cortex-m23, I get the same code with and without -mpure-code.
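So that looks like a generic cortex-m23 codegen issue rather than a -mpure-code
one. The desired output would presumably match the cortex-m3 sequence quoted
above, with the +4 folded into the relocations (sketch, not actual output):

    movw    r0, #:lower16:.LANCHOR0+4
    movt    r0, #:upper16:.LANCHOR0+4
    bx      lr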
>
> And cortex-m0 again inserts an extra literal load:
>
> movs r3, #:upper8_15:#.LC0
> lsls r3, #8
> adds r3, #:upper0_7:#.LC0
> lsls r3, #8
> adds r3, #:lower8_15:#.LC0
> lsls r3, #8
> adds r3, #:lower0_7:#.LC0
> ldr r0, [r3]
> adds r0, r0, #4
> bx lr
Yes, it's the same problem as in f1().
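If the addend can be carried by these relocations (to be confirmed on the
assembler side), the ideal cortex-m0 sequence would be something like this
hand-written sketch:

    movs    r0, #:upper8_15:.LANCHOR0+4
    lsls    r0, #8
    adds    r0, #:upper0_7:.LANCHOR0+4
    lsls    r0, #8
    adds    r0, #:lower8_15:.LANCHOR0+4
    lsls    r0, #8
    adds    r0, #:lower0_7:.LANCHOR0+4
    bx      lr              @ return &arr[1] directly, no literal load, no extra adds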
So I think -mpure-code for v6-M is not broken, but yes, the generated code can
be improved. Maybe that's not really relevant to this PR, then?