On Sun, Jun 25, 2023 at 3:23 PM Hongtao Liu <[email protected]> wrote:
>
> On Sun, Jun 25, 2023 at 3:13 PM Hongtao Liu <[email protected]> wrote:
> >
> > On Sun, Jun 25, 2023 at 1:52 PM Jan Beulich <[email protected]> wrote:
> > >
> > > On 25.06.2023 06:42, Hongtao Liu wrote:
> > > > On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches
> > > > <[email protected]> wrote:
> > > >>
> > > >> +(define_code_iterator andor [and ior])
> > > >> +(define_code_attr nlogic [(and "nor") (ior "nand")])
> > > >> +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")])
> > > >> +
> > > >> +(define_insn "*<nlogic><mode>3"
> > > >> + [(set (match_operand:VI 0 "register_operand" "=v,v")
> > > >> + (andor:VI
> > > >> + (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v"))
> > > >> + (not:VI (match_operand:VI 2 "bcst_vector_operand"
> > > >> "vBr,m"))))]
> > > > I'm thinking of doing it in simplify_rtx or gimple match.pd to transform
> > > > > (and (not op1) (not op2)) -> (not: (ior: op1 op2))
> > >
> > > This wouldn't be a win (not + andn) -> (or + not), but what's
> > > more important is ...
> > >
> > > > (ior (not op1) (not op2)) -> (not : (and op1 op2))
> > > >
> > > > > Even w/o avx512f, the transformation should also be a benefit since it
> > > > > takes fewer logic operations: 3 -> 2 (or 2 -> 2 with pandn).
> > >
> > > ... that these transformations (from the, as per the doc,
> > > canonical representation of nand and nor) are already occurring
> > I see, there are already such simplifications in the gimple phase, so
> > the question is: is there any need for the and/ior:not not pattern?
> > Can you provide a testcase to demonstrate that the and/ior:not not
> > pattern is needed?
>
> typedef int v4si __attribute__((vector_size(16)));
> v4si
> foo1 (v4si a, v4si b)
> {
> return ~a & ~b;
> }
>
> I only see that gimple has optimized it to
>
> <bb 2> [local count: 1073741824]:
> # DEBUG BEGIN_STMT
> _1 = a_2(D) | b_3(D);
> _4 = ~_1;
> return _4;
>
>
> But rtl still tries to match
>
> (set (reg:V4SI 86)
> (and:V4SI (not:V4SI (reg:V4SI 88))
> (not:V4SI (reg:V4SI 89))))
>
> Hmm.
In rtl, we're using xor -1 for not, so it's
(insn 8 7 9 2 (set (reg:V4SI 87)
(ior:V4SI (reg:V4SI 88)
(reg:V4SI 89))) "/app/example.cpp":6:15 6830 {*iorv4si3}
(expr_list:REG_DEAD (reg:V4SI 89)
(expr_list:REG_DEAD (reg:V4SI 88)
(nil))))
(insn 9 8 14 2 (set (reg:V4SI 86)
(xor:V4SI (reg:V4SI 87)
(const_vector:V4SI [
(const_int -1 [0xffffffffffffffff]) repeated x4
]))) "/app/example.cpp":6:18 6792 {*one_cmplv4si2}
Then simplified to
> (set (reg:V4SI 86)
> (and:V4SI (not:V4SI (reg:V4SI 88))
> (not:V4SI (reg:V4SI 89))))
>
by
    case XOR:
      if (trueop1 == CONST0_RTX (mode))
        return op0;
      if (INTEGRAL_MODE_P (mode) && trueop1 == CONSTM1_RTX (mode))
        return simplify_gen_unary (NOT, mode, op0, mode);
and
  /* Apply De Morgan's laws to reduce number of patterns for machines
     with negating logical insns (and-not, nand, etc.). If result has
     only one NOT, put it first, since that is how the patterns are
     coded. */
  if (GET_CODE (op) == IOR || GET_CODE (op) == AND)
    {
      rtx in1 = XEXP (op, 0), in2 = XEXP (op, 1);
      machine_mode op_mode;

      op_mode = GET_MODE (in1);
      in1 = simplify_gen_unary (NOT, op_mode, in1, op_mode);

      op_mode = GET_MODE (in2);
      if (op_mode == VOIDmode)
        op_mode = mode;
      in2 = simplify_gen_unary (NOT, op_mode, in2, op_mode);

      if (GET_CODE (in2) == NOT && GET_CODE (in1) != NOT)
        std::swap (in1, in2);

      return gen_rtx_fmt_ee (GET_CODE (op) == IOR ? AND : IOR,
                             mode, in1, in2);
    }
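To make the chain concrete, here is a rough sketch in C of the three shapes involved, using the same v4si type as the testcase: the gimple form (or, then complement), the two-insn RTL form (complement written as xor with all-ones), and the and-of-nots form that results from the De Morgan rewrite. The function names are made up for illustration; this only checks that the forms compute the same value and does not model what combine actually does.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Shape after gimple optimization: ior, then complement.  */
static v4si nor_gimple (v4si a, v4si b) { return ~(a | b); }

/* Shape of the two RTL insns: complement expressed as xor with -1.  */
static v4si
nor_xor (v4si a, v4si b)
{
  const v4si all_ones = { -1, -1, -1, -1 };
  return (a | b) ^ all_ones;
}

/* Shape after applying De Morgan: and of two nots.  */
static v4si nor_and_of_nots (v4si a, v4si b) { return ~a & ~b; }

/* Element-wise equality of two vectors.  */
static int
same (v4si x, v4si y)
{
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Return nonzero when all three shapes agree for the given inputs.  */
static int
forms_agree (v4si a, v4si b)
{
  v4si g = nor_gimple (a, b);
  return same (g, nor_xor (a, b)) && same (g, nor_and_of_nots (a, b));
}

/* Check on one arbitrary sample input.  */
static void
check_forms (void)
{
  v4si a = { 0, 1, -1, 0x12345678 };
  v4si b = { 0, 2, 5, 0x0f0f0f0f };
  assert (forms_agree (a, b));
}
```

Since all three are bitwise identities, any inputs should do; the sample values in check_forms are arbitrary.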
OK, got it; the and/ior:not not pattern LGTM then.
> > > in common code, _if_ no suitable insn can be found. That was at
> > > least the conclusion I drew from looking around a lot, supported
> > > by the code that's generated prior to this change.
> > >
> > > Jan
--
BR,
Hongtao