An x86 conditional branch (jcc) target can be either a label or a symbol.
Add a pass that folds a tail call into a jcc by turning:

        jcc     .L6
...
.L6:
        jmp     tailcall

into:

        jcc     tailcall
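
For illustration only, a C function of roughly the following shape is the
kind of source where a conditional branch can lead to a block holding
nothing but a tail call.  This is a hypothetical sketch, not one of the
testcases in this series: the names foo, bar and calls are placeholders,
bar is defined locally only to make the example self-contained, and the
noinline attribute is an assumption made to keep the call visible.  The
exact code generated depends on the compiler version and options.

```c
/* Hypothetical example, not one of the patch's testcases.  */
static int calls;

/* Stand-in callee.  In real code it would usually live in another
   translation unit; noinline keeps the call from being inlined.  */
__attribute__ ((noinline)) void
bar (void)
{
  calls++;
}

void
foo (int x)
{
  /* The call is the last action on this path, so the compiler may turn
     it into a sibcall (jmp bar).  The other path just returns.  */
  if (x == 0)
    bar ();
}
```

Compiling such a file with optimization and inspecting the assembly
(for example with -O2 -S) is one way to check whether the branch to the
sibcall block was folded.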

After the basic block reordering pass, conditional branches look like:

(jump_insn 7 6 14 2 (set (pc)
        (if_then_else (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref:DI 23)
            (pc))) "x.c":8:5 1458 {jcc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (int_list:REG_BR_PROB 217325348 (nil)))
...
(code_label 23 20 8 4 4 (nil) [1 uses])
(note 8 23 9 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(call_insn/j 9 8 10 4 (call (mem:QI (symbol_ref:DI ("bar") [flags 0x41]  <function_decl 0x7f4cff3c0b00 bar>) [0 bar S1 A8])
        (const_int 0 [0])) "x.c":8:14 discrim 1 1469 {sibcall_di}
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar") [flags 0x41]  <function_decl 0x7f4cff3c0b00 bar>)
        (nil))
    (nil))

If the destination of the branch edge is a basic block that contains only
a direct sibcall, change the jcc target to the sibcall target and
decrement the use count of the destination basic block's entry label.
Even when the destination basic block becomes unused, it must be kept,
since the RTL control flow check requires it and JUMP_LABEL of the
conditional jump can only point to a code label, not a symbol reference.
Dummy sibcall patterns are added so that sibcalls in basic blocks whose
entry label use count is 0 won't be generated.
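
The rule above can be modelled with a small, self-contained C sketch.
It is a toy model under stated assumptions, not the code added to
i386-features.cc: struct toy_block, struct toy_jcc, toy_fold_jcc and
toy_emit_sibcall_p are hypothetical names, and the model assumes a block
is reached only through its entry label.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a basic block; not GCC's data structures.  */
struct toy_block
{
  int label_uses;            /* Use count of the block's entry label.  */
  const char *only_sibcall;  /* Callee name if the block holds only a
                                direct sibcall, otherwise NULL.  */
};

/* Toy stand-in for a conditional jump.  */
struct toy_jcc
{
  struct toy_block *dest;    /* Block reached when the branch is taken.  */
  const char *symbol;        /* Non-NULL once the jcc targets a symbol.  */
};

/* Fold JCC into a conditional tail call if its destination holds only
   a direct sibcall.  Return true if the fold was done.  */
bool
toy_fold_jcc (struct toy_jcc *jcc)
{
  struct toy_block *bb = jcc->dest;

  if (jcc->symbol != NULL || bb == NULL || bb->only_sibcall == NULL)
    return false;

  jcc->symbol = bb->only_sibcall;  /* jcc .L6  ->  jcc tailcall  */
  bb->label_uses--;                /* One fewer use of the entry label.  */
  /* jcc->dest is left untouched: the block is kept even if unused.  */
  return true;
}

/* In this model, a block's sibcall is emitted only while its entry
   label still has users.  */
bool
toy_emit_sibcall_p (const struct toy_block *bb)
{
  return bb->label_uses > 0;
}
```

In this model the destination block survives the fold with a use count
of 0, which mirrors the requirement that JUMP_LABEL keeps pointing at a
code label while the sibcall itself is no longer output.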

Jump tables like

foo:
        .cfi_startproc
        cmpl    $4, %edi
        ja      .L1
        movl    %edi, %edi
        jmp     *.L4(,%rdi,8)
        .section        .rodata
.L4:
        .quad   .L8
        .quad   .L7
        .quad   .L6
        .quad   .L5
        .quad   .L3
        .text
.L5:
        jmp     bar3
.L3:
        jmp     bar4
.L8:
        jmp     bar0
.L7:
        jmp     bar1
.L6:
        jmp     bar2
.L1:
        ret
        .cfi_endproc

can also be changed to:

foo:
        .cfi_startproc
        cmpl    $4, %edi
        ja      .L1
        movl    %edi, %edi
        jmp     *.L4(,%rdi,8)
        .section        .rodata
.L4:
        .quad   bar0
        .quad   bar1
        .quad   bar2
        .quad   bar3
        .quad   bar4
        .text
.L1:
        ret
        .cfi_endproc
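
As a hypothetical illustration, not taken from the testcases in this
series, a switch of the following shape is the kind of source where every
jump table entry can lead to a block holding only a tail call.  The
function names follow the assembly above; the variable last, the local
definitions of bar0 to bar4 and the noinline attribute are assumptions
made to keep the example self-contained.  Whether a jump table is used at
all depends on the compiler version and options.

```c
/* Hypothetical example, not one of the patch's testcases.  */
static int last = -1;

/* Stand-in callees; noinline keeps the calls from being inlined.  */
__attribute__ ((noinline)) void bar0 (void) { last = 0; }
__attribute__ ((noinline)) void bar1 (void) { last = 1; }
__attribute__ ((noinline)) void bar2 (void) { last = 2; }
__attribute__ ((noinline)) void bar3 (void) { last = 3; }
__attribute__ ((noinline)) void bar4 (void) { last = 4; }

void
foo (unsigned int i)
{
  /* Each case ends in a call followed by a return, so each case body
     may become a block that contains only a sibcall.  */
  switch (i)
    {
    case 0: bar0 (); break;
    case 1: bar1 (); break;
    case 2: bar2 (); break;
    case 3: bar3 (); break;
    case 4: bar4 (); break;
    default: break;
    }
}
```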

After the basic block reordering pass, jump tables look like:

(jump_table_data 16 15 17 (addr_vec:DI [
            (label_ref:DI 18)
            (label_ref:DI 22)
            (label_ref:DI 26)
            (label_ref:DI 30)
            (label_ref:DI 34)
        ]))
...
(code_label 30 17 31 4 5 (nil) [1 uses])
(note 31 30 32 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(call_insn/j 32 31 33 4 (call (mem:QI (symbol_ref:DI ("bar3") [flags 0x41]  <function_decl 0x7f21be3c0e00 bar3>) [0 bar3 S1 A8])
        (const_int 0 [0])) "j.c":15:13 1469 {sibcall_di}
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar3") [flags 0x41]  <function_decl 0x7f21be3c0e00 bar3>)
        (nil))
    (nil))

If a jump table entry points to a target basic block that contains only a
direct sibcall, change the entry to point to the sibcall target and
decrement the use count of the target basic block's entry label.  Unless
the target basic block must be kept as the JUMP_LABEL of a conditional
tail call, delete it once its entry label use count reaches 0.
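
The jump table rule can be sketched the same way.  This is a toy model
under stated assumptions, not the code in this series: struct toy_target,
struct toy_entry and toy_fold_jump_table are hypothetical names, and the
kept_for_jcc flag stands in for "still needed as JUMP_LABEL of a
conditional tail call".

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a jump table target block; not GCC's structures.  */
struct toy_target
{
  int label_uses;            /* Use count of the block's entry label.  */
  const char *only_sibcall;  /* Callee name if the block holds only a
                                direct sibcall, otherwise NULL.  */
  bool kept_for_jcc;         /* Still needed as JUMP_LABEL of a jcc.  */
  bool deleted;              /* Set when the model deletes the block.  */
};

/* One jump table entry: either a label or, after folding, a symbol.  */
struct toy_entry
{
  struct toy_target *label;  /* NULL once replaced by a symbol.  */
  const char *symbol;        /* Non-NULL once the entry was folded.  */
};

/* Fold every entry whose target holds only a direct sibcall.  Return
   the number of entries that were changed.  */
int
toy_fold_jump_table (struct toy_entry *table, size_t n)
{
  int folded = 0;

  for (size_t i = 0; i < n; i++)
    {
      struct toy_target *bb = table[i].label;

      if (bb == NULL || bb->only_sibcall == NULL)
        continue;

      table[i].symbol = bb->only_sibcall;  /* .quad .L8 -> .quad bar0  */
      table[i].label = NULL;
      bb->label_uses--;

      /* Delete the block once nothing refers to its label, unless a
         conditional tail call still needs it as JUMP_LABEL.  */
      if (bb->label_uses == 0 && !bb->kept_for_jcc)
        bb->deleted = true;

      folded++;
    }

  return folded;
}
```

A target shared by several entries is deleted only after the last entry
that referenced it has been rewritten, since the use count reaches 0 only
then.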

Update final_scan_insn_1 to skip a label if its use count is 0 and to
support symbol references in jump tables.  Update create_trace_edges to
skip symbol references in jump tables.

H.J. Lu (2):
  x86: Add a pass to fold tail call
  x86: Fold sibcall targets into jump table

 gcc/config/i386/i386-features.cc           | 274 +++++++++++++++++++++
 gcc/config/i386/i386-passes.def            |   1 +
 gcc/config/i386/i386-protos.h              |   3 +
 gcc/config/i386/i386.cc                    |  12 +
 gcc/config/i386/i386.md                    |  57 ++++-
 gcc/config/i386/predicates.md              |   4 +
 gcc/dwarf2cfi.cc                           |   7 +-
 gcc/final.cc                               |  26 +-
 gcc/testsuite/gcc.target/i386/pr14721-1a.c |  54 ++++
 gcc/testsuite/gcc.target/i386/pr14721-1b.c |  37 +++
 gcc/testsuite/gcc.target/i386/pr14721-1c.c |  37 +++
 gcc/testsuite/gcc.target/i386/pr14721-2a.c |  58 +++++
 gcc/testsuite/gcc.target/i386/pr14721-2b.c |  41 +++
 gcc/testsuite/gcc.target/i386/pr14721-2c.c |  43 ++++
 gcc/testsuite/gcc.target/i386/pr14721-3a.c |  56 +++++
 gcc/testsuite/gcc.target/i386/pr14721-3b.c |  40 +++
 gcc/testsuite/gcc.target/i386/pr14721-3c.c |  39 +++
 gcc/testsuite/gcc.target/i386/pr47253-1a.c |  24 ++
 gcc/testsuite/gcc.target/i386/pr47253-1b.c |  17 ++
 gcc/testsuite/gcc.target/i386/pr47253-2a.c |  27 ++
 gcc/testsuite/gcc.target/i386/pr47253-2b.c |  17 ++
 gcc/testsuite/gcc.target/i386/pr47253-3a.c |  32 +++
 gcc/testsuite/gcc.target/i386/pr47253-3b.c |  20 ++
 gcc/testsuite/gcc.target/i386/pr47253-3c.c |  20 ++
 gcc/testsuite/gcc.target/i386/pr47253-4a.c |  26 ++
 gcc/testsuite/gcc.target/i386/pr47253-4b.c |  18 ++
 gcc/testsuite/gcc.target/i386/pr47253-5.c  |  15 ++
 gcc/testsuite/gcc.target/i386/pr47253-6.c  |  15 ++
 gcc/testsuite/gcc.target/i386/pr47253-7a.c |  52 ++++
 gcc/testsuite/gcc.target/i386/pr47253-7b.c |  36 +++
 30 files changed, 1097 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-7b.c

-- 
2.48.1
