On 27/09/14 22:20, Kugan wrote:
On 23/09/14 01:58, Jiong Wang wrote:
On 22/09/14 16:43, Kugan wrote:
AArch64 has the same issue ARM had, where the LR register was not used in
leaf functions. This was reported in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. On AArch64, the
test-case needs more live ranges before LR_REGNUM is needed, i.e. the
test-case in the PR needs additional loops, up to r31, for AArch64 to
show the problem.
The same fix (from the thread
https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
into ARM should apply to AArch64 as well. Regression tested on qemu for
aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?
This is still a partial fix. LR should be a caller-saved register that is
free to use as long as it is saved properly across function calls.
Indeed. This should be improved in the generic code. Right now, if a
hard register is used in EPILOGUE_USES, it conflicts with all the live
ranges until a call site kills it. I think we should have this patch until
the generic code can be improved.
Below is my local patch. LR is treated as a free register and, strictly
following the AArch64 ABI, a frame should always be created and FP
maintained properly if LR is clobbered under -fno-omit-frame-pointer.
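
As a rough illustration of the three decisions the patch changes, here is a minimal standalone sketch. It is not GCC source: every model_* name is hypothetical, the table only mirrors the patched CALL_USED_REGISTERS rows for x0 to x30, and the last helper models only the newly added else-branch of aarch64_can_eliminate, not the whole function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC source.  Register numbers follow the
   AArch64 general register file, x0..x30.  */
enum { MODEL_R0 = 0, MODEL_R29 = 29, MODEL_LR = 30, MODEL_NUM_GPRS = 31 };

/* Mirrors the patched CALL_USED_REGISTERS rows for x0..x30:
   1 = caller-saved (clobbered across calls), 0 = callee-saved.
   The entry for x30 is the one the patch flips from 0 to 1.  */
static const bool model_call_used[MODEL_NUM_GPRS] = {
  1, 1, 1, 1, 1, 1, 1, 1,   /* x0  - x7  */
  1, 1, 1, 1, 1, 1, 1, 1,   /* x8  - x15 */
  1, 1, 1, 0, 0, 0, 0, 0,   /* x16 - x23 */
  0, 0, 0, 0, 0, 1, 1       /* x24 - x30 */
};

/* Patched EPILOGUE_USES: LR is only forced live once the epilogue has
   been generated, so it no longer conflicts with earlier live ranges.  */
static bool
model_epilogue_uses (int regno, bool epilogue_completed)
{
  return epilogue_completed && regno == MODEL_LR;
}

/* Patched check in aarch64_layout_frame: a register gets a save slot if
   it was ever live and is either LR or callee-saved.  */
static bool
model_needs_save_slot (int regno, bool ever_live)
{
  return ever_live && (regno == MODEL_LR || !model_call_used[regno]);
}

/* Added branch in aarch64_can_eliminate: refuse elimination to SP when
   the leaf frame pointer was to be omitted but LR ended up live.  */
static bool
model_blocks_sp_elimination (bool to_is_sp, bool omit_leaf_fp,
                             bool lr_ever_live)
{
  return to_is_sp && omit_leaf_fp && lr_ever_live;
}
```

In this model, x30 needs a save slot whenever it has been live even though it is now marked caller-saved, while a scratch register such as x0 never does.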
gcc/
* config/aarch64/aarch64.h (CALL_USED_REGISTERS): Mark LR as caller-save.
(EPILOGUE_USES): Guard the check by epilogue_completed.
* config/aarch64/aarch64.c (aarch64_layout_frame): Explicitly check for LR.
(aarch64_can_eliminate): Check LR_REGNUM liveness.
gcc/testsuite/
* gcc.target/aarch64/lr_free_1.c: New testcase for -fomit-frame-pointer.
* gcc.target/aarch64/lr_free_2.c: New testcase for leaf
-fno-omit-frame-pointer.
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index db950da..892b310 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -250,7 +250,7 @@ extern unsigned long aarch64_tune_flags;
1, 1, 1, 1, 1, 1, 1, 1, /* R0 - R7 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* R8 - R15 */ \
1, 1, 1, 0, 0, 0, 0, 0, /* R16 - R23 */ \
- 0, 0, 0, 0, 0, 1, 0, 1, /* R24 - R30, SP */ \
+ 0, 0, 0, 0, 0, 1, 1, 1, /* R24 - R30, SP */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V0 - V7 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V16 - V23 */ \
@@ -309,7 +309,7 @@ extern unsigned long aarch64_tune_flags;
considered live at the start of the called function. */
#define EPILOGUE_USES(REGNO) \
- ((REGNO) == LR_REGNUM)
+ (epilogue_completed && (REGNO) == LR_REGNUM)
/* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
the stack pointer does not matter. The value is tested only in
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 15c7be6..8b39b2a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1864,7 +1864,8 @@ aarch64_layout_frame (void)
/* ... and any callee saved register that dataflow says is live. */
for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
if (df_regs_ever_live_p (regno)
- && !call_used_regs[regno])
+ && (regno == R30_REGNUM
+ || !call_used_regs[regno]))
cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
@@ -4313,6 +4314,16 @@ aarch64_can_eliminate (const int from, const int to)
return false;
}
+ else
+ {
+ /* If we decided that we didn't need a leaf frame pointer but then used
+ LR in the function, then we'll want a frame pointer after all, so
+ prevent this elimination to ensure a frame pointer is used. */
+ if (to == STACK_POINTER_REGNUM
+ && flag_omit_leaf_frame_pointer
+ && df_regs_ever_live_p (LR_REGNUM))
+ return false;
+ }
return true;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_1.c b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
new file mode 100644
index 0000000..4c530a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -fomit-frame-pointer -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-x28 -ffixed-x29 --save-temps -mgeneral-regs-only -fno-ipa-cp" } */
+
+extern void abort ();
+
+int
+dec (int a, int b)
+{
+ return a + b;
+}
+
+int
+cal (int a, int b)
+{
+ int sum1 = a * b;
+ int sum2 = a / b;
+ int sum = dec (sum1, sum2);
+ return a + b + sum + sum1 + sum2;
+}
+
+int
+main (int argc, char **argv)
+{
+ int ret = cal (2, 1);
+
+ if (ret != 11)
+ abort ();
+
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times "str\tx30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+/* { dg-final { scan-assembler "str\tw30, \\\[sp, \[0-9\]+\\\]" } } */
+
+/* { dg-final { scan-assembler "ldr\tw30, \\\[sp, \[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "ldr\tx30, \\\[sp\\\], \[0-9\]+" 2 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_2.c b/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
new file mode 100644
index 0000000..2bfb6ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-x28 --save-temps -mgeneral-regs-only -fno-ipa-cp -fdump-rtl-ira" } */
+
+extern void abort ();
+
+int
+cal (int a, int b)
+{
+ /* { dg-final { scan-assembler-times "stp\tx29, x30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+ int sum = a + b;
+ int sum1 = a * b;
+ /* { dg-final { scan-assembler-times "ldp\tx29, x30, \\\[sp\\\], \[0-9\]+" 2 } } */
+ /* { dg-final { scan-rtl-dump "assign reg 30" "ira" } } */
+ return (a + b + sum + sum1);
+}
+
+int
+main (int argc, char **argv)
+{
+ int ret = cal (1, 2);
+
+ if (ret != 8)
+ abort ();
+
+ return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */