On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu <[email protected]> wrote:
>
> Changes in v3:
>
> 1. Rebase against commit 02e68389494
> 2. Don't add call_no_callee_saved_registers to machine_function since
> all callee-saved registers are properly clobbered by callee with
> no_callee_saved_registers attribute.
>
The patch LGTM. It should be low risk since there's already a
no_caller_saved_registers attribute; the patch just extends the same
approach to no_callee_saved_registers.
So if there are no objections (or other concerns) in the next couple of days,
I'm OK with the patch going into GCC 14 and being backported.

> Changes in v2:
>
> 1. Rebase against commit f9df00340e3
> 2. Don't add redundant clobbered_registers check in ix86_expand_call.
>
> In some cases, there is no need to save callee-saved registers:
>
> 1. If a noreturn function neither throws nor supports exceptions, it can
> skip saving callee-saved registers.
>
> 2. When an interrupt handler is implemented by an assembly stub which does:
>
>   1. Save all registers.
>   2. Call a C function.
>   3. Restore all registers.
>   4. Return from interrupt.
>
> it is completely unnecessary to save and restore any registers in the C
> function called by the assembly stub, even if they would normally be
> callee-saved.
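
To make the interrupt-stub case concrete, here is a minimal sketch of the C
side only. The names handle_irq, irq_count and NO_CALLEE_SAVED are
hypothetical and not taken from the patch; the macro falls back to the
default ABI when the compiler does not know the attribute, so this only
illustrates intent and is not what the patch tests.

```c
/* Sketch: C body called from an assembly stub that has already saved
   every register.  All names here are illustrative assumptions.  */
#ifdef __has_attribute
# if __has_attribute(no_callee_saved_registers)
#  define NO_CALLEE_SAVED __attribute__((no_callee_saved_registers))
# endif
#endif
#ifndef NO_CALLEE_SAVED
# define NO_CALLEE_SAVED /* attribute unavailable: default ABI applies */
#endif

/* Per-vector counters; volatile so every update is a real memory access.  */
static volatile unsigned long irq_count[16];

/* With the attribute active, the compiler may use registers such as
   %rbx or %r12 here without pushing and popping them.  */
NO_CALLEE_SAVED unsigned long
handle_irq (unsigned int vector)
{
  irq_count[vector & 15]++;
  return irq_count[vector & 15];
}
```

A C caller that sees this declaration is expected to treat all registers as
clobbered across the call, so calling it directly from C should also be safe.
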
>
> This patch set adds the no_callee_saved_registers function attribute,
> which is complementary to the no_caller_saved_registers function
> attribute, and classifies how the x86 backend handles call-saved
> registers into three types:
>   1. Default call-saved registers.
>   2. No caller-saved registers with no_caller_saved_registers attribute.
>   3. No callee-saved registers with no_callee_saved_registers attribute.
>
> Functions of the no callee-saved registers type won't save callee-saved
> registers.  If a noreturn function neither throws nor supports
> exceptions, it is classified as the no callee-saved registers type.
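
As a rough illustration of the kind of function this rule targets, the
sketch below shows a noreturn helper in plain C. The names fatal and
checked_div are hypothetical. Whether a given compiler actually drops the
callee-saved pushes depends on its version and flags, and a function this
small may not need any callee-saved registers in the first place.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fatal-error helper.  It never returns to its caller and,
   being declared nothrow, never unwinds, so nothing can observe the
   callee-saved registers it would otherwise preserve.  */
__attribute__((noreturn, nothrow)) void
fatal (const char *msg)
{
  fprintf (stderr, "fatal: %s\n", msg);
  exit (EXIT_FAILURE);
}

/* Ordinary function; it still follows the default calling convention.  */
int
checked_div (int num, int den)
{
  if (den == 0)
    fatal ("division by zero");
  return num / den;
}
```
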
>
> With these changes, __libc_start_main in glibc 2.39, which is a noreturn
> function, is changed from
>
> __libc_start_main:
>         endbr64
>         push   %r15
>         push   %r14
>         mov    %rcx,%r14
>         push   %r13
>         push   %r12
>         push   %rbp
>         mov    %esi,%ebp
>         push   %rbx
>         mov    %rdx,%rbx
>         sub    $0x28,%rsp
>         mov    %rdi,(%rsp)
>         mov    %fs:0x28,%rax
>         mov    %rax,0x18(%rsp)
>         xor    %eax,%eax
>         test   %r9,%r9
>
> to
>
> __libc_start_main:
>         endbr64
>         sub    $0x28,%rsp
>         mov    %esi,%ebp
>         mov    %rdx,%rbx
>         mov    %rcx,%r14
>         mov    %rdi,(%rsp)
>         mov    %fs:0x28,%rax
>         mov    %rax,0x18(%rsp)
>         xor    %eax,%eax
>         test   %r9,%r9
>
> In Linux kernel 6.7.0 on x86-64, do_exit is changed from
>
> do_exit:
>         endbr64
>         call   <do_exit+0x9>
>         push   %r15
>         push   %r14
>         push   %r13
>         push   %r12
>         mov    %rdi,%r12
>         push   %rbp
>         push   %rbx
>         mov    %gs:0x0,%rbx
>         sub    $0x28,%rsp
>         mov    %gs:0x28,%rax
>         mov    %rax,0x20(%rsp)
>         xor    %eax,%eax
>         call   *0x0(%rip)        # <do_exit+0x39>
>         test   $0x2,%ah
>         je     <do_exit+0x8d3>
>
> to
>
> do_exit:
>         endbr64
>         call   <do_exit+0x9>
>         sub    $0x28,%rsp
>         mov    %rdi,%r12
>         mov    %gs:0x28,%rax
>         mov    %rax,0x20(%rsp)
>         xor    %eax,%eax
>         mov    %gs:0x0,%rbx
>         call   *0x0(%rip)        # <do_exit+0x2f>
>         test   $0x2,%ah
>         je     <do_exit+0x8c9>
>
> I compared GCC master branch bootstrap and test times on a slow machine
> with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
> with the backported patch.  The performance data isn't precise since the
> measurements were done on different days with different GCC sources under
> different 6.6 kernel versions.
>
> GCC master branch build time in seconds:
>
> before                after                  improvement
> 30043.75user          30013.16user           0%
> 1274.85system         1243.72system          2.4%
>
> GCC master branch test time in seconds (new tests added):
>
> before                after                  improvement
> 216035.90user         216547.51user          0%
> 27365.51system        26658.54system         2.6%
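
For reference, the improvement column appears to be the relative reduction
(before - after) / before. A small sketch under that assumption follows; the
helper name improvement_percent is made up for illustration.

```c
/* Percentage reduction from `before` to `after`.  A positive result
   means the "after" run took less time.  */
double
improvement_percent (double before, double after)
{
  return (before - after) / before * 100.0;
}
```

With the numbers above, the build system time (1274.85 to 1243.72) comes out
at roughly 2.4% and the test system time (27365.51 to 26658.54) at roughly
2.6%. The user times move by well under 1% in either direction, which the
table rounds to 0.
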
>
> Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
> Systems perform normally.
>
>
> H.J. Lu (2):
>   x86: Add no_callee_saved_registers function attribute
>   x86: Don't save callee-saved registers in noreturn functions
>
>  gcc/config/i386/i386-expand.cc                | 52 +++++++++++++---
>  gcc/config/i386/i386-options.cc               | 61 +++++++++++++++----
>  gcc/config/i386/i386.cc                       | 57 +++++++++++++----
>  gcc/config/i386/i386.h                        | 16 ++++-
>  gcc/doc/extend.texi                           |  8 +++
>  .../gcc.dg/torture/no-callee-saved-run-1a.c   | 23 +++++++
>  .../gcc.dg/torture/no-callee-saved-run-1b.c   | 59 ++++++++++++++++++
>  .../gcc.target/i386/no-callee-saved-1.c       | 30 +++++++++
>  .../gcc.target/i386/no-callee-saved-10.c      | 46 ++++++++++++++
>  .../gcc.target/i386/no-callee-saved-11.c      | 11 ++++
>  .../gcc.target/i386/no-callee-saved-12.c      | 10 +++
>  .../gcc.target/i386/no-callee-saved-13.c      | 16 +++++
>  .../gcc.target/i386/no-callee-saved-14.c      | 16 +++++
>  .../gcc.target/i386/no-callee-saved-15.c      | 17 ++++++
>  .../gcc.target/i386/no-callee-saved-16.c      | 16 +++++
>  .../gcc.target/i386/no-callee-saved-17.c      | 16 +++++
>  .../gcc.target/i386/no-callee-saved-18.c      | 51 ++++++++++++++++
>  .../gcc.target/i386/no-callee-saved-2.c       | 30 +++++++++
>  .../gcc.target/i386/no-callee-saved-3.c       |  8 +++
>  .../gcc.target/i386/no-callee-saved-4.c       |  8 +++
>  .../gcc.target/i386/no-callee-saved-5.c       | 11 ++++
>  .../gcc.target/i386/no-callee-saved-6.c       | 12 ++++
>  .../gcc.target/i386/no-callee-saved-7.c       | 49 +++++++++++++++
>  .../gcc.target/i386/no-callee-saved-8.c       | 50 +++++++++++++++
>  .../gcc.target/i386/no-callee-saved-9.c       | 49 +++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr38534-1.c     | 26 ++++++++
>  gcc/testsuite/gcc.target/i386/pr38534-2.c     | 18 ++++++
>  gcc/testsuite/gcc.target/i386/pr38534-3.c     | 19 ++++++
>  gcc/testsuite/gcc.target/i386/pr38534-4.c     | 18 ++++++
>  .../gcc.target/i386/stack-check-17.c          | 19 +++---
>  30 files changed, 775 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
>  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c
>
> --
> 2.43.0
>


-- 
BR,
Hongtao
