On Sat, Feb 1, 2025 at 6:33 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Sat, Feb 1, 2025 at 5:52 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Sat, Feb 1, 2025 at 9:51 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > If TARGET_INDIRECT_BRANCH_REGISTER is true, indirect call and jump should
> > > use register, not memory.  Update Bs, Bw and Bz constraints to disable
> > > indirect call over memmory if TARGET_INDIRECT_BRANCH_REGISTER true, change
> > > x32 call over GOT slot to call over register and also disable sibcall
> > > over memory.
> > >
> > > gcc/
> > >
> > >         PR target/118713
> > >         * config/i386/constraints.md (Bs): Always disable if
> > >         TARGET_INDIRECT_BRANCH_REGISTER is true.
> > >         (Bw): Likewise.
> > >         * config/i386/i386-expand.cc (ix86_expand_call): Force indirect
> > >         call via register for GOT slot call if
> > >         TARGET_INDIRECT_BRANCH_REGISTER is true.
> > >         * config/i386/i386-protos.h (ix86_nopic_noplt_attribute_p): New.
> > >         * config/i386/i386.cc (ix86_nopic_noplt_attribute_p): Make it
> > >         global.
> > >         * config/i386/i386.md (*call_got_x32): Disable indirect call via
> > >         memory for TARGET_INDIRECT_BRANCH_REGISTER.
> > >         (*call_value_got_x32): Likewise.
> > >         (*sibcall_value_pop_memory): Likewise.
> > >         * config/i386/predicates.md (constant_call_address_operand):
> > >         Return false if both TARGET_INDIRECT_BRANCH_REGISTER and
> > >         ix86_nopic_noplt_attribute_p are true.
> >
> > LGTM, with some code reshuffling, proposed below. I didn't do a
> > thorough review on how T_I_B_R is handled, you are the expert in the
> > area of symbol handling.
> >
> > Thanks,
> > Uros.
> >
> > >
> > > gcc/testsuite/
> > >
> > >         PR target/118713
> > >         * gcc.target/i386/pr118713-1-x32.c: New test.
> > >         * gcc.target/i386/pr118713-1.c: Likewise.
> > >         * gcc.target/i386/pr118713-2-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-2.c: Likewise.
> > >         * gcc.target/i386/pr118713-3-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-3.c: Likewise.
> > >         * gcc.target/i386/pr118713-4-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-4.c: Likewise.
> > >         * gcc.target/i386/pr118713-5-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-5.c: Likewise.
> > >         * gcc.target/i386/pr118713-6-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-6.c: Likewise.
> > >         * gcc.target/i386/pr118713-7-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-7.c: Likewise.
> > >         * gcc.target/i386/pr118713-8-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-8.c: Likewise.
> > >         * gcc.target/i386/pr118713-9-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-9.c: Likewise.
> > >         * gcc.target/i386/pr118713-10-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-10.c: Likewise.
> > >         * gcc.target/i386/pr118713-11-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-11.c: Likewise.
> > >         * gcc.target/i386/pr118713-12-x32.c: Likewise.
> > >         * gcc.target/i386/pr118713-12.c: Likewise.
> > >
> > > Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
> > > ---
> > >  gcc/config/i386/constraints.md                | 24 +++++++++----------
> > >  gcc/config/i386/i386-expand.cc                | 22 +++++++++++------
> > >  gcc/config/i386/i386-protos.h                 |  1 +
> > >  gcc/config/i386/i386.cc                       |  2 +-
> > >  gcc/config/i386/i386.md                       |  6 ++---
> > >  gcc/config/i386/predicates.md                 |  4 +++-
> > >  .../gcc.target/i386/pr118713-1-x32.c          |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-1.c    | 14 +++++++++++
> > >  .../gcc.target/i386/pr118713-10-x32.c         |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-10.c   | 15 ++++++++++++
> > >  .../gcc.target/i386/pr118713-11-x32.c         |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-11.c   | 14 +++++++++++
> > >  .../gcc.target/i386/pr118713-12-x32.c         |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-12.c   | 14 +++++++++++
> > >  .../gcc.target/i386/pr118713-2-x32.c          |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-2.c    | 15 ++++++++++++
> > >  .../gcc.target/i386/pr118713-3-x32.c          |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-3.c    | 14 +++++++++++
> > >  .../gcc.target/i386/pr118713-4-x32.c          |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-4.c    | 14 +++++++++++
> > >  .../gcc.target/i386/pr118713-5-x32.c          |  7 ++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-5.c    | 12 ++++++++++
> > >  .../gcc.target/i386/pr118713-6-x32.c          |  7 ++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-6.c    | 13 ++++++++++
> > >  .../gcc.target/i386/pr118713-7-x32.c          |  7 ++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-7.c    | 12 ++++++++++
> > >  .../gcc.target/i386/pr118713-8-x32.c          |  7 ++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-8.c    | 12 ++++++++++
> > >  .../gcc.target/i386/pr118713-9-x32.c          |  8 +++++++
> > >  gcc/testsuite/gcc.target/i386/pr118713-9.c    | 14 +++++++++++
> > >  30 files changed, 290 insertions(+), 24 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-1-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-10-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-10.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-11-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-11.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-12-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-12.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-2-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-3-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-4-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-5-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-5.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-6-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-6.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-7-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-7.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-8-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-8.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-9-x32.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118713-9.c
> > >
> > > diff --git a/gcc/config/i386/constraints.md 
> > > b/gcc/config/i386/constraints.md
> > > index 9dd280568bd..d3d6e9e66f4 100644
> > > --- a/gcc/config/i386/constraints.md
> > > +++ b/gcc/config/i386/constraints.md
> > > @@ -203,21 +203,21 @@ (define_special_memory_constraint "Br"
> > >
> > >  (define_constraint "Bs"
> > >    "@internal Sibcall memory operand."
> > > -  (ior (and (not (match_test "TARGET_INDIRECT_BRANCH_REGISTER"))
> > > -           (not (match_test "TARGET_X32"))
> > > -           (match_operand 0 "sibcall_memory_operand"))
> > > -       (and (match_test "TARGET_X32")
> > > -           (match_test "Pmode == DImode")
> > > -           (match_operand 0 "GOT_memory_operand"))))
> > > +  (and (match_test "!TARGET_INDIRECT_BRANCH_REGISTER")
> > > +       (ior (and (match_test "!TARGET_X32")
> > > +                 (match_operand 0 "sibcall_memory_operand"))
> > > +            (and (match_test "TARGET_X32")
> > > +                 (match_test "Pmode == DImode")
> > > +                 (match_operand 0 "GOT_memory_operand")))))
> >
> > While here, the above can use if_then_else RTX (and not RTX for
> > consistency), like:
> >
> >   (and (not (match_test "TARGET_INDIRECT_BRANCH_REGISTER"))
> >        (if_then_else (match_test "TARGET_X32")
> >             (and (match_test "Pmode == DImode")
> >                     (match_operand 0 "GOT_memory_operand"))
> >             (match_operand 0 "sibcall_memory_operand")))
>
> Will fix it.
>
> > >  (define_constraint "Bw"
> > >    "@internal Call memory operand."
> > > -  (ior (and (not (match_test "TARGET_INDIRECT_BRANCH_REGISTER"))
> > > -           (not (match_test "TARGET_X32"))
> > > -           (match_operand 0 "memory_operand"))
> > > -       (and (match_test "TARGET_X32")
> > > -           (match_test "Pmode == DImode")
> > > -           (match_operand 0 "GOT_memory_operand"))))
> > > +  (and (match_test "!TARGET_INDIRECT_BRANCH_REGISTER")
> > > +       (ior (and (match_test "!TARGET_X32")
> > > +                 (match_operand 0 "memory_operand"))
> > > +            (and (match_test "TARGET_X32")
> > > +                 (match_test "Pmode == DImode")
> > > +                 (match_operand 0 "GOT_memory_operand")))))
> >
> > Also here.
>
> Will fix it.
>
> >
> > >  (define_constraint "Bz"
> > >    "@internal Constant call address operand."
> > > diff --git a/gcc/config/i386/i386-expand.cc 
> > > b/gcc/config/i386/i386-expand.cc
> > > index 117f6f6f7eb..7a2df43e7cd 100644
> > > --- a/gcc/config/i386/i386-expand.cc
> > > +++ b/gcc/config/i386/i386-expand.cc
> > > @@ -10225,13 +10225,21 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
> > > callarg1,
> > >      fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 
> > > 0)));
> > >    /* Since x32 GOT slot is 64 bit with zero upper 32 bits, indirect
> > >       branch via x32 GOT slot is OK.  */
> > > -  else if (!(TARGET_X32
> > > -            && MEM_P (fnaddr)
> > > -            && GET_CODE (XEXP (fnaddr, 0)) == ZERO_EXTEND
> > > -            && GOT_memory_operand (XEXP (XEXP (fnaddr, 0), 0), Pmode))
> > > -          && (sibcall
> > > -              ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode)
> > > -              : !call_insn_operand (XEXP (fnaddr, 0), word_mode)))
> > > +  if (TARGET_X32
> >
> > Please keep "else if" here. construct_plt_address always returns REG,
> > which is universally accepted as a call operand.
> >
> > > +      && MEM_P (fnaddr)
> > > +      && GET_CODE (XEXP (fnaddr, 0)) == ZERO_EXTEND
> > > +      && GOT_memory_operand (XEXP (XEXP (fnaddr, 0), 0), Pmode))
> >
> > The purpose of this condition is to bypass the "else if" below. If you
> > write it as:
> >
> > else if (TARGET_X32
> >       && MEM_P (fnaddr)
> >       && GET_CODE (XEXP (fnaddr, 0)) == ZERO_EXTEND
> >       && GOT_memory_operand (XEXP (XEXP (fnaddr, 0), 0), Pmode)
> >       && !TARGET_INDIRECT_BRANCH_REGISTER)
> >         ;
> > else if (sibcall ...
> >
> > Then bypass won't be active, and "else if (sibcall ...)" will trigger,
> > copying the address to the register in the exact same way as your
> > newly added code.
>
> Will fix it.
>
> >
> > > +       {
> > > +         fnaddr = convert_to_mode (word_mode, XEXP (fnaddr, 0), 1);
> > > +         fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode,
> > > +                                                         fnaddr));
> > > +       }
> > > +    }
> > > +  else if (sibcall
> > > +          ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode)
> > > +          : !call_insn_operand (XEXP (fnaddr, 0), word_mode))
> > >      {
> > >        fnaddr = convert_to_mode (word_mode, XEXP (fnaddr, 0), 1);
> > >        fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, 
> > > fnaddr));
> > > diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> > > index f122fd8a0a3..bea3fd4b2e2 100644
> > > --- a/gcc/config/i386/i386-protos.h
> > > +++ b/gcc/config/i386/i386-protos.h
> > > @@ -371,6 +371,7 @@ extern int asm_preferred_eh_data_format (int, int);
> > >  extern enum attr_cpu ix86_schedule;
> > >  #endif
> > >
> > > +extern bool ix86_nopic_noplt_attribute_p (rtx call_op);
> > >  extern const char * ix86_output_call_insn (rtx_insn *insn, rtx call_op);
> > >  extern const char * ix86_output_indirect_jmp (rtx call_op);
> > >  extern const char * ix86_output_function_return (bool long_p);
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index 11770aa8a50..f89201684a8 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -16724,7 +16724,7 @@ ix86_ifunc_ref_local_ok (void)
> > >     This is currently used only with 64-bit or 32-bit GOT32X ELF targets
> > >     to call the function marked "noplt" indirectly.  */
> > >
> > > -static bool
> > > +bool
> > >  ix86_nopic_noplt_attribute_p (rtx call_op)
> > >  {
> > >    if (flag_pic || ix86_cmodel == CM_LARGE
> > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > > index 52c02b6351a..d6ae3ee378a 100644
> > > --- a/gcc/config/i386/i386.md
> > > +++ b/gcc/config/i386/i386.md
> > > @@ -20133,7 +20133,7 @@ (define_insn "*call_got_x32"
> > >    [(call (mem:QI (zero_extend:DI
> > >                    (match_operand:SI 0 "GOT_memory_operand" "Bg")))
> > >          (match_operand 1))]
> > > -  "TARGET_X32"
> > > +  "TARGET_X32 && !TARGET_INDIRECT_BRANCH_REGISTER"
> > >  {
> > >    rtx fnaddr = gen_const_mem (DImode, XEXP (operands[0], 0));
> > >    return ix86_output_call_insn (insn, fnaddr);
> > > @@ -20338,7 +20338,7 @@ (define_insn "*call_value_got_x32"
> > >                 (zero_extend:DI
> > >                   (match_operand:SI 1 "GOT_memory_operand" "Bg")))
> > >               (match_operand 2)))]
> > > -  "TARGET_X32"
> > > +  "TARGET_X32 && !TARGET_INDIRECT_BRANCH_REGISTER"
> > >  {
> > >    rtx fnaddr = gen_const_mem (DImode, XEXP (operands[1], 0));
> > >    return ix86_output_call_insn (insn, fnaddr);
> > > @@ -20460,7 +20460,7 @@ (define_insn "*sibcall_value_pop_memory"
> > >         (plus:SI (reg:SI SP_REG)
> > >                  (match_operand:SI 3 "immediate_operand" "i")))
> > >     (unspec [(const_int 0)] UNSPEC_PEEPSIB)]
> > > -  "!TARGET_64BIT"
> > > +  "!TARGET_64BIT && !TARGET_INDIRECT_BRANCH_REGISTER"
> > >    "* return ix86_output_call_insn (insn, operands[1]);"
> > >    [(set_attr "type" "callv")])
> > >
> > > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> > > index e733a474f9e..9a9101ed374 100644
> > > --- a/gcc/config/i386/predicates.md
> > > +++ b/gcc/config/i386/predicates.md
> > > @@ -670,7 +670,9 @@ (define_predicate "constant_call_address_operand"
> > >    (match_code "symbol_ref")
> > >  {
> > >    if (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC
> > > -      || flag_force_indirect_call)
> > > +      || flag_force_indirect_call
> > > +      || (TARGET_INDIRECT_BRANCH_REGISTER
> > > +          && ix86_nopic_noplt_attribute_p (op)))
> > >      return false;
> > >    if (TARGET_DLLIMPORT_DECL_ATTRIBUTES && SYMBOL_REF_DLLIMPORT_P (op))
> > >      return false;
>
> This is the patch I am testing.

No regressions.  I am checking it in.

Thanks.

-- 
H.J.

Reply via email to