Sorry, it's my question. I still have some questions that I haven't understood, so I haven't replied to the email yet.:-(

在 2023/4/10 下午5:04, Xi Ruoyao 写道:
Ping.  Or maybe I've lost some replies here because my mail server
crashed several days ago :).

On Wed, 2023-03-29 at 02:01 +0800, Xi Ruoyao wrote:
LoongArch backend used to save all GARs for a function with variable
arguments.  But sometimes a function only accepts variable arguments
for
a purpose like C++ function overloading.  For example, POSIX defines
open() as:

     int open(const char *path, int oflag, ...);

But only two forms are actually used:

     int open(const char *pathname, int flags);
     int open(const char *pathname, int flags, mode_t mode);

So it's obviously a waste to save all 8 GARs in open().  We can use
the
cfun->va_list_gpr_size field set by the stdarg pass to only save the
GARs necessary to be saved.

If the va_list escapes (for example, in fprintf() we pass it to
vfprintf()), stdarg would set cfun->va_list_gpr_size to 255 so we
don't need a special case.

With this patch, only one GAR ($a2/$r6) is saved in open().  Ideally
even this stack store should be omitted too, but doing so is not
trivial
and AFAIK there are no compilers (for any target) performing the
"ideal"
optimization here, see https://godbolt.org/z/n1YqWq9c9.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk
(GCC 14 or now)?

gcc/ChangeLog:

         * config/loongarch/loongarch.cc
         (loongarch_setup_incoming_varargs): Don't save more GARs than
         cfun->va_list_gpr_size / UNITS_PER_WORD.

gcc/testsuite/ChangeLog:

         * gcc.target/loongarch/va_arg.c: New test.
---
  gcc/config/loongarch/loongarch.cc           |  4 +++-
  gcc/testsuite/gcc.target/loongarch/va_arg.c | 24
+++++++++++++++++++++
  2 files changed, 27 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/va_arg.c

diff --git a/gcc/config/loongarch/loongarch.cc
b/gcc/config/loongarch/loongarch.cc
index 6927bdc7fe5..0ecb91ca997 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -764,7 +764,9 @@ loongarch_setup_incoming_varargs
(cumulative_args_t cum,
      loongarch_function_arg_advance (pack_cumulative_args
(&local_cum), arg);
   /* Found out how many registers we need to save.  */
-  gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs;
+  gp_saved = cfun->va_list_gpr_size / UNITS_PER_WORD;
+  if (gp_saved > (int) (MAX_ARGS_IN_REGISTERS - local_cum.num_gprs))
+    gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs;
   if (!no_rtl && gp_saved > 0)
      {
diff --git a/gcc/testsuite/gcc.target/loongarch/va_arg.c
b/gcc/testsuite/gcc.target/loongarch/va_arg.c
new file mode 100644
index 00000000000..980c96d0e3d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/va_arg.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Technically we shouldn't save any register for this function: it
should be
+   compiled as if it accepts 3 named arguments.  But AFAIK no
compilers can
+   achieve this "perfect" optimization now, so just ensure we are
using the
+   knowledge provided by stdarg pass and we won't save GARs
impossible to be
+   accessed with __builtin_va_arg () when the va_list does not
escape.  */
+
+/* { dg-final { scan-assembler-not "st.*r7" } } */
+
+int
+test (int a0, ...)
+{
+  void *arg;
+  int a1, a2;
+
+  __builtin_va_start (arg, a0);
+  a1 = __builtin_va_arg (arg, int);
+  a2 = __builtin_va_arg (arg, int);
+  __builtin_va_end (arg);
+
+  return a0 + a1 + a2;
+}

Reply via email to