On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu <[email protected]> wrote:
>
> Changes in v3:
>
> 1. Rebase against commit 02e68389494
> 2. Don't add call_no_callee_saved_registers to machine_function since
>    all callee-saved registers are properly clobbered by callee with
>    no_callee_saved_registers attribute.
>

The patch LGTM. It should be low risk, since the no_caller_saved_registers
attribute already exists and the patch just extends the same approach to
no_callee_saved_registers. So if there are no objections or concerns in the
next couple of days, I'm OK with the patch going into GCC 14 and being
backported.

> Changes in v2:
>
> 1. Rebase against commit f9df00340e3
> 2. Don't add redundant clobbered_registers check in ix86_expand_call.
>
> In some cases, there is no need to save callee-saved registers:
>
> 1. If a noreturn function doesn't throw nor support exceptions, it can
>    skip saving callee-saved registers.
>
> 2. When an interrupt handler is implemented by an assembly stub which does:
>
>    1. Save all registers.
>    2. Call a C function.
>    3. Restore all registers.
>    4. Return from interrupt.
>
>    it is completely unnecessary to save and restore any registers in the C
>    function called by the assembly stub, even if they would normally be
>    callee-saved.
>
> This patch set adds no_callee_saved_registers function attribute, which
> is complementary to no_caller_saved_registers function attribute, to
> classify x86 backend call-saved register handling type with
>
> 1. Default call-saved registers.
> 2. No caller-saved registers with no_caller_saved_registers attribute.
> 3. No callee-saved registers with no_callee_saved_registers attribute.
>
> Functions of no callee-saved registers won't save callee-saved registers.
> If a noreturn function doesn't throw nor support exceptions, it is
> classified as the no callee-saved registers type.
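
For anyone following along, here is a minimal sketch of how the two cases
described above might look in C source. It is only an illustration, not code
from the patch: the names NO_CALLEE_SAVED, handle_event and fatal are
hypothetical, the assembly stub that would save and restore all registers is
not shown, and the macro falls back to nothing on compilers or targets that
do not know the attribute.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumption: the attribute is only recognized by an x86 GCC that contains
   this patch set.  Everywhere else the macro expands to nothing, so the
   code still builds, just without the effect described in the cover letter.  */
#if defined(__has_attribute)
# if __has_attribute(no_callee_saved_registers)
#  define NO_CALLEE_SAVED __attribute__((no_callee_saved_registers))
# endif
#endif
#ifndef NO_CALLEE_SAVED
# define NO_CALLEE_SAVED
#endif

/* Hypothetical C body of an interrupt handler.  In the scenario from the
   cover letter an assembly stub saves all registers before calling this
   function and restores them afterwards, so the function itself does not
   need to preserve any callee-saved registers.  */
NO_CALLEE_SAVED void handle_event(int *counter)
{
    *counter += 1;
}

/* Hypothetical noreturn function that never throws.  Per the description
   above, such a function may be classified as the no callee-saved
   registers type without any explicit attribute.  */
__attribute__((noreturn)) void fatal(const char *msg)
{
    fputs(msg, stderr);
    exit(EXIT_FAILURE);
}
```

Calling handle_event directly from ordinary C code is still expected to work,
since the caller then treats the callee-saved registers as clobbered across
the call; the attribute is mainly interesting when the caller is a stub that
has already saved everything.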
>
> With these changes, __libc_start_main in glibc 2.39, which is a noreturn
> function, is changed from
>
> __libc_start_main:
>         endbr64
>         push   %r15
>         push   %r14
>         mov    %rcx,%r14
>         push   %r13
>         push   %r12
>         push   %rbp
>         mov    %esi,%ebp
>         push   %rbx
>         mov    %rdx,%rbx
>         sub    $0x28,%rsp
>         mov    %rdi,(%rsp)
>         mov    %fs:0x28,%rax
>         mov    %rax,0x18(%rsp)
>         xor    %eax,%eax
>         test   %r9,%r9
>
> to
>
> __libc_start_main:
>         endbr64
>         sub    $0x28,%rsp
>         mov    %esi,%ebp
>         mov    %rdx,%rbx
>         mov    %rcx,%r14
>         mov    %rdi,(%rsp)
>         mov    %fs:0x28,%rax
>         mov    %rax,0x18(%rsp)
>         xor    %eax,%eax
>         test   %r9,%r9
>
> In Linux kernel 6.7.0 on x86-64, do_exit is changed from
>
> do_exit:
>         endbr64
>         call   <do_exit+0x9>
>         push   %r15
>         push   %r14
>         push   %r13
>         push   %r12
>         mov    %rdi,%r12
>         push   %rbp
>         push   %rbx
>         mov    %gs:0x0,%rbx
>         sub    $0x28,%rsp
>         mov    %gs:0x28,%rax
>         mov    %rax,0x20(%rsp)
>         xor    %eax,%eax
>         call   *0x0(%rip)        # <do_exit+0x39>
>         test   $0x2,%ah
>         je     <do_exit+0x8d3>
>
> to
>
> do_exit:
>         endbr64
>         call   <do_exit+0x9>
>         sub    $0x28,%rsp
>         mov    %rdi,%r12
>         mov    %gs:0x28,%rax
>         mov    %rax,0x20(%rsp)
>         xor    %eax,%eax
>         mov    %gs:0x0,%rbx
>         call   *0x0(%rip)        # <do_exit+0x2f>
>         test   $0x2,%ah
>         je     <do_exit+0x8c9>
>
> I compared GCC master branch bootstrap and test times on a slow machine
> with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
> with the backported patch. The performance data isn't precise since the
> measurements were done on different days with different GCC sources under
> different 6.6 kernel versions.
>
> GCC master branch build time in seconds:
>
> before              after               improvement
> 30043.75user        30013.16user        0%
> 1274.85system       1243.72system       2.4%
>
> GCC master branch test time in seconds (new tests added):
>
> before              after               improvement
> 216035.90user       216547.51user       0
> 27365.51system      26658.54system      2.6%
>
> Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
> Systems perform normally.
>
>
> H.J. Lu (2):
>   x86: Add no_callee_saved_registers function attribute
>   x86: Don't save callee-saved registers in noreturn functions
>
>  gcc/config/i386/i386-expand.cc              | 52 +++++++++++++---
>  gcc/config/i386/i386-options.cc             | 61 +++++++++++++++----
>  gcc/config/i386/i386.cc                     | 57 +++++++++++++----
>  gcc/config/i386/i386.h                      | 16 ++++-
>  gcc/doc/extend.texi                         |  8 +++
>  .../gcc.dg/torture/no-callee-saved-run-1a.c | 23 +++++++
>  .../gcc.dg/torture/no-callee-saved-run-1b.c | 59 ++++++++++++++++++
>  .../gcc.target/i386/no-callee-saved-1.c     | 30 +++++++++
>  .../gcc.target/i386/no-callee-saved-10.c    | 46 ++++++++++++++
>  .../gcc.target/i386/no-callee-saved-11.c    | 11 ++++
>  .../gcc.target/i386/no-callee-saved-12.c    | 10 +++
>  .../gcc.target/i386/no-callee-saved-13.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-14.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-15.c    | 17 ++++++
>  .../gcc.target/i386/no-callee-saved-16.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-17.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-18.c    | 51 ++++++++++++++++
>  .../gcc.target/i386/no-callee-saved-2.c     | 30 +++++++++
>  .../gcc.target/i386/no-callee-saved-3.c     |  8 +++
>  .../gcc.target/i386/no-callee-saved-4.c     |  8 +++
>  .../gcc.target/i386/no-callee-saved-5.c     | 11 ++++
>  .../gcc.target/i386/no-callee-saved-6.c     | 12 ++++
>  .../gcc.target/i386/no-callee-saved-7.c     | 49 +++++++++++++++
>  .../gcc.target/i386/no-callee-saved-8.c     | 50 +++++++++++++++
>  .../gcc.target/i386/no-callee-saved-9.c     | 49 +++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr38534-1.c   | 26 ++++++++
>  gcc/testsuite/gcc.target/i386/pr38534-2.c   | 18 ++++++
>  gcc/testsuite/gcc.target/i386/pr38534-3.c   | 19 ++++++
>  gcc/testsuite/gcc.target/i386/pr38534-4.c   | 18 ++++++
>  .../gcc.target/i386/stack-check-17.c        | 19 +++---
>  30 files changed, 775 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
>  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c
>
> --
> 2.43.0
>

--
BR,
Hongtao
