On Mon, Oct 23, 2017 at 1:07 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Mon, Oct 23, 2017 at 12:27:15PM +0200, Uros Bizjak wrote:
>> On Mon, Oct 23, 2017 at 12:09 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>> > On Sun, Oct 22, 2017 at 08:04:28PM +0200, Uros Bizjak wrote:
>> >> Hello!
>> >>
>> >> In PR 82628 Jakub figured out that insn patterns that consume carry
>> >> flag were not 100% correct. Due to this issue, combine is able to
>> >> simplify various CC_REG propagations that result in invalid code.
>> >>
>> >> Attached patch fixes (well, mitigates) the above problem by splitting
>> >> the double-mode compare after the reload, in the same way other
>> >> *_doubleword patterns are handled from "the beginning of the time".
>> >
>> > I'm afraid this is going to haunt us sooner or later, combine isn't the
>> > only pass that uses simplify-rtx.c infrastructure heavily and when we lie
>> > in the RTL pattern, eventually something will be simplified wrongly.
>> >
>> > So, at least we'd need to use UNSPEC for the pattern, like (only lightly
>> > tested so far) below.
>>
>> I agree with the above. Patterns that consume Carry flag are now
>> marked with (plus (ltu (...)), but effectively, they behave like
>> unspecs. So, I see no problem to change all SBB and ADC to unspec at
>> once, similar to the change you proposed in the patch.
>
> So like this (addcarry/subborrow defered to a separate patch)?
> Or do you want to use UNSPEC even for the unsigned comparison case,
> i.e. from the patch remove the predicates.md/constraints.md part,
> sub<mode>3_carry_ccc{,_1} and anything related to that?

Looking at the attached patch, I think, this won't be necessary
anymore. The pattern is quite important for 32bit targets, so this
fact warrants a couple of complicated patterns.

> As for addcarry/subborrow, the problem is that we expect in the pr67317*
> tests that combine is able to notice that the CF setter sets CF to
> unconditional 0 and matches the pattern.  With the patch I wrote
> we end up with the combiner trying to match an insn where the CCC
> is set from a TImode comparison:
> (parallel [
>         (set (reg:CC 17 flags)
>             (compare:CC (zero_extend:TI (plus:DI (reg/v:DI 92 [ a ])
>                         (reg/v:DI 94 [ c ])))
>                 (zero_extend:TI (reg/v:DI 94 [ c ]))))
>         (set (reg:DI 98)
>             (plus:DI (reg/v:DI 92 [ a ])
>                 (reg/v:DI 94 [ c ])))
>     ])
> So, either we need a define_insn_and_split pattern that would deal with
> that (for UNSPEC it would be the same thing, have a define_insn_and_split
> that would replace the (ltu...) with (const_int 0)), or perhaps be smarter
> during expansion, if we see the first argument is constant 0, expand it
> like a normal add instruction with CC setter.
>
> 2017-10-23  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/82628
>         * config/i386/predicates.md (x86_64_dwzext_immediate_operand): New.
>         * config/i386/constraints.md (Wf): New constraint.
>         * config/i386/i386.md (UNSPEC_SBB): New unspec.
>         (cmp<dwi>_doubleword): Removed.
>         (sub<mode>3_carry_ccc, *sub<mode>3_carry_ccc_1): New patterns.
>         (sub<mode>3_carry_ccgz): Use unspec instead of compare.
>         * config/i386/i386.c (ix86_expand_branch) <case E_TImode>: Don't
>         expand with cmp<dwi>_doubleword.  For LTU and GEU use
>         sub<mode>3_carry_ccc instead of sub<mode>3_carry_ccgz and use CCCmode.

OK.

Thanks,
Uros.

> --- gcc/config/i386/predicates.md.jj    2017-10-23 12:00:13.899355249 +0200
> +++ gcc/config/i386/predicates.md       2017-10-23 12:52:20.696576114 +0200
> @@ -366,6 +366,31 @@ (define_predicate "x86_64_hilo_int_opera
>      }
>  })
>
> +;; Return true if VALUE is a constant integer whose value is
> +;; x86_64_immediate_operand value zero extended from word mode to mode.
> +(define_predicate "x86_64_dwzext_immediate_operand"
> +  (match_code "const_int,const_wide_int")
> +{
> +  switch (GET_CODE (op))
> +    {
> +    case CONST_INT:
> +      if (!TARGET_64BIT)
> +       return UINTVAL (op) <= HOST_WIDE_INT_UC (0xffffffff);
> +      return UINTVAL (op) <= HOST_WIDE_INT_UC (0x7fffffff);
> +
> +    case CONST_WIDE_INT:
> +      if (!TARGET_64BIT)
> +       return false;
> +      return (CONST_WIDE_INT_NUNITS (op) == 2
> +             && CONST_WIDE_INT_ELT (op, 1) == 0
> +             && (trunc_int_for_mode (CONST_WIDE_INT_ELT (op, 0), SImode)
> +                 == (HOST_WIDE_INT) CONST_WIDE_INT_ELT (op, 0)));
> +
> +    default:
> +      gcc_unreachable ();
> +    }
> +})
> +
>  ;; Return true if size of VALUE can be stored in a sign
>  ;; extended immediate field.
>  (define_predicate "x86_64_immediate_size_operand"
> --- gcc/config/i386/constraints.md.jj   2017-10-23 12:00:13.850355874 +0200
> +++ gcc/config/i386/constraints.md      2017-10-23 12:52:20.697576102 +0200
> @@ -332,6 +332,11 @@ (define_constraint "Wd"
>     of it satisfies the e constraint."
>    (match_operand 0 "x86_64_hilo_int_operand"))
>
> +(define_constraint "Wf"
> +  "32-bit signed integer constant zero extended from word size
> +   to double word size."
> +  (match_operand 0 "x86_64_dwzext_immediate_operand"))
> +
>  (define_constraint "Z"
>    "32-bit unsigned integer constant, or a symbolic reference known
>     to fit that range (for immediate operands in zero-extending x86-64
> --- gcc/config/i386/i386.md.jj  2017-10-23 12:51:19.350356044 +0200
> +++ gcc/config/i386/i386.md     2017-10-23 12:52:20.701576051 +0200
> @@ -112,6 +112,7 @@ (define_c_enum "unspec" [
>    UNSPEC_STOS
>    UNSPEC_PEEPSIB
>    UNSPEC_INSN_FALSE_DEP
> +  UNSPEC_SBB
>
>    ;; For SSE/MMX support:
>    UNSPEC_FIX_NOTRUNC
> @@ -1273,26 +1274,6 @@ (define_expand "cmp<mode>_1"
>         (compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
>                     (match_operand:SWI48 1 "<general_operand>")))])
>
> -(define_insn_and_split "cmp<dwi>_doubleword"
> -  [(set (reg:CCGZ FLAGS_REG)
> -       (compare:CCGZ
> -         (match_operand:<DWI> 1 "register_operand" "0")
> -         (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "ro<di>")))
> -   (clobber (match_scratch:<DWI> 0 "=r"))]
> -  ""
> -  "#"
> -  "reload_completed"
> -  [(set (reg:CC FLAGS_REG)
> -       (compare:CC (match_dup 1) (match_dup 2)))
> -   (parallel [(set (reg:CCGZ FLAGS_REG)
> -                  (compare: CCGZ
> -                    (match_dup 4)
> -                    (plus:DWIH
> -                      (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
> -                      (match_dup 5))))
> -             (clobber (match_dup 3))])]
> -  "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], 
> &operands[3]);")
> -
>  (define_insn "*cmp<mode>_ccno_1"
>    [(set (reg FLAGS_REG)
>         (compare (match_operand:SWI 0 "nonimmediate_operand" "<r>,?m<r>")
> @@ -6911,13 +6892,46 @@ (define_insn "*subsi3_carry_zext"
>     (set_attr "pent_pair" "pu")
>     (set_attr "mode" "SI")])
>
> -(define_insn "*sub<mode>3_carry_ccgz"
> +(define_insn "sub<mode>3_carry_ccc"
> +  [(set (reg:CCC FLAGS_REG)
> +       (compare:CCC
> +         (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
> +         (plus:<DWI>
> +           (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
> +           (zero_extend:<DWI>
> +             (match_operand:DWIH 2 "x86_64_sext_operand" "rmWe")))))
> +   (clobber (match_scratch:DWIH 0 "=r"))]
> +  ""
> +  "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
> +  [(set_attr "type" "alu")
> +   (set_attr "mode" "<MODE>")])
> +
> +(define_insn "*sub<mode>3_carry_ccc_1"
> +  [(set (reg:CCC FLAGS_REG)
> +       (compare:CCC
> +         (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
> +         (plus:<DWI>
> +           (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
> +           (match_operand:<DWI> 2 "x86_64_dwzext_immediate_operand" "Wf"))))
> +   (clobber (match_scratch:DWIH 0 "=r"))]
> +  ""
> +{
> +  operands[3] = simplify_subreg (<MODE>mode, operands[2], <DWI>mode, 0);
> +  return "sbb{<imodesuffix>}\t{%3, %0|%0, %3}";
> +}
> +  [(set_attr "type" "alu")
> +   (set_attr "mode" "<MODE>")])
> +
> +;; The sign flag is set from the
> +;; (compare (match_dup 1) (plus:DWIH (ltu:DWIH ...) (match_dup 2)))
> +;; result, the overflow flag likewise, but the overflow flag is also
> +;; set if the (plus:DWIH (ltu:DWIH ...) (match_dup 2)) overflows.
> +(define_insn "sub<mode>3_carry_ccgz"
>    [(set (reg:CCGZ FLAGS_REG)
> -       (compare:CCGZ
> -         (match_operand:DWIH 1 "register_operand" "0")
> -         (plus:DWIH
> -           (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
> -           (match_operand:DWIH 2 "x86_64_general_operand" "rme"))))
> +       (unspec:CCGZ [(match_operand:DWIH 1 "register_operand" "0")
> +                     (match_operand:DWIH 2 "x86_64_general_operand" "rme")
> +                     (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
> +                    UNSPEC_SBB))
>     (clobber (match_scratch:DWIH 0 "=r"))]
>    ""
>    "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
> --- gcc/config/i386/i386.c.jj   2017-10-23 12:51:19.349356057 +0200
> +++ gcc/config/i386/i386.c      2017-10-23 12:52:20.711575924 +0200
> @@ -22378,27 +22378,45 @@ ix86_expand_branch (enum rtx_code code,
>         switch (code)
>           {
>           case LE: case LEU: case GT: case GTU:
> -           std::swap (op0, op1);
> +           std::swap (lo[0], lo[1]);
> +           std::swap (hi[0], hi[1]);
>             code = swap_condition (code);
>             /* FALLTHRU */
>
>           case LT: case LTU: case GE: case GEU:
>             {
> -             rtx (*cmp_insn) (rtx, rtx, rtx);
> +             rtx (*cmp_insn) (rtx, rtx);
> +             rtx (*sbb_insn) (rtx, rtx, rtx);
> +             bool uns = (code == LTU || code == GEU);
>
>               if (TARGET_64BIT)
> -               cmp_insn = gen_cmpti_doubleword;
> +               {
> +                 cmp_insn = gen_cmpdi_1;
> +                 sbb_insn
> +                   = uns ? gen_subdi3_carry_ccc : gen_subdi3_carry_ccgz;
> +               }
>               else
> -               cmp_insn = gen_cmpdi_doubleword;
> +               {
> +                 cmp_insn = gen_cmpsi_1;
> +                 sbb_insn
> +                   = uns ? gen_subsi3_carry_ccc : gen_subsi3_carry_ccgz;
> +               }
> +
> +             if (!nonimmediate_operand (lo[0], submode))
> +               lo[0] = force_reg (submode, lo[0]);
> +             if (!x86_64_general_operand (lo[1], submode))
> +               lo[1] = force_reg (submode, lo[1]);
> +
> +             if (!register_operand (hi[0], submode))
> +               hi[0] = force_reg (submode, hi[0]);
> +             if ((uns && !nonimmediate_operand (hi[1], submode))
> +                 || (!uns && !x86_64_general_operand (hi[1], submode)))
> +               hi[1] = force_reg (submode, hi[1]);
>
> -             if (!register_operand (op0, mode))
> -               op0 = force_reg (mode, op0);
> -             if (!x86_64_hilo_general_operand (op1, mode))
> -               op1 = force_reg (mode, op1);
> +             emit_insn (cmp_insn (lo[0], lo[1]));
> +             emit_insn (sbb_insn (gen_rtx_SCRATCH (submode), hi[0], hi[1]));
>
> -             emit_insn (cmp_insn (gen_rtx_SCRATCH (mode), op0, op1));
> -
> -             tmp = gen_rtx_REG (CCGZmode, FLAGS_REG);
> +             tmp = gen_rtx_REG (uns ? CCCmode : CCGZmode, FLAGS_REG);
>
>               ix86_expand_branch (code, tmp, const0_rtx, label);
>               return;
>
>
>         Jakub

Reply via email to