On Sun, Jun 25, 2023 at 3:23 PM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Sun, Jun 25, 2023 at 3:13 PM Hongtao Liu <crazy...@gmail.com> wrote:
> >
> > On Sun, Jun 25, 2023 at 1:52 PM Jan Beulich <jbeul...@suse.com> wrote:
> > >
> > > On 25.06.2023 06:42, Hongtao Liu wrote:
> > > > On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > >>
> > > >> +(define_code_iterator andor [and ior])
> > > >> +(define_code_attr nlogic [(and "nor") (ior "nand")])
> > > >> +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")])
> > > >> +
> > > >> +(define_insn "*<nlogic><mode>3"
> > > >> +  [(set (match_operand:VI 0 "register_operand" "=v,v")
> > > >> +       (andor:VI
> > > >> +         (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v"))
> > > >> +         (not:VI (match_operand:VI 2 "bcst_vector_operand" "vBr,m"))))]
> > > > I'm thinking of doing it in simplify_rtx or gimple match.pd to transform
> > > > (and (not op1)) (not op2)) -> (not: (ior: op1 op2))
> > >
> > > This wouldn't be a win (not + andn) -> (or + not), but what's
> > > more important is ...
> > >
> > > > (ior (not op1) (not op2)) -> (not : (and op1 op2))
> > > >
> > > > Even w/o avx512f, the transformation should also benefit since it
> > > > takes less logic operations 3 -> 2.(or 2 -> 2 for pandn).
> > >
> > > ... that these transformations (from the, as per the doc,
> > > canonical representation of nand and nor) are already occurring
> > I see, there're already such simplifications in the gimple phase, so
> > the question: is there any need for and/ior:not not pattern?
> > Can you provide a testcase to demonstrate that and/ior: not not
> > pattern is needed?
> >
> typedef int v4si __attribute__((vector_size(16)));
> v4si
> foo1 (v4si a, v4si b)
> {
>   return ~a & ~b;
> }
>
> I only gimple have optimized it to
>
>   <bb 2> [local count: 1073741824]:
>   # DEBUG BEGIN_STMT
>   _1 = a_2(D) | b_3(D);
>   _4 = ~_1;
>   return _4;
>
>
> But rtl still try to match
>
> (set (reg:V4SI 86)
>     (and:V4SI (not:V4SI (reg:V4SI 88))
>         (not:V4SI (reg:V4SI 89))))
>
>

Hmm. In rtl, we're using xor -1 for not, so it's

(insn 8 7 9 2 (set (reg:V4SI 87)
        (ior:V4SI (reg:V4SI 88)
            (reg:V4SI 89))) "/app/example.cpp":6:15 6830 {*iorv4si3}
     (expr_list:REG_DEAD (reg:V4SI 89)
        (expr_list:REG_DEAD (reg:V4SI 88)
            (nil))))
(insn 9 8 14 2 (set (reg:V4SI 86)
        (xor:V4SI (reg:V4SI 87)
            (const_vector:V4SI [
                    (const_int -1 [0xffffffffffffffff]) repeated x4
                ]))) "/app/example.cpp":6:18 6792 {*one_cmplv4si2}

Then simplified to

> (set (reg:V4SI 86)
>     (and:V4SI (not:V4SI (reg:V4SI 88))
>         (not:V4SI (reg:V4SI 89))))
>

by

    case XOR:
      if (trueop1 == CONST0_RTX (mode))
        return op0;
      if (INTEGRAL_MODE_P (mode) && trueop1 == CONSTM1_RTX (mode))
        return simplify_gen_unary (NOT, mode, op0, mode);

and

      /* Apply De Morgan's laws to reduce number of patterns for machines
         with negating logical insns (and-not, nand, etc.).  If result has
         only one NOT, put it first, since that is how the patterns are
         coded.  */
      if (GET_CODE (op) == IOR || GET_CODE (op) == AND)
        {
          rtx in1 = XEXP (op, 0), in2 = XEXP (op, 1);
          machine_mode op_mode;

          op_mode = GET_MODE (in1);
          in1 = simplify_gen_unary (NOT, op_mode, in1, op_mode);

          op_mode = GET_MODE (in2);
          if (op_mode == VOIDmode)
            op_mode = mode;
          in2 = simplify_gen_unary (NOT, op_mode, in2, op_mode);

          if (GET_CODE (in2) == NOT && GET_CODE (in1) != NOT)
            std::swap (in1, in2);

          return gen_rtx_fmt_ee (GET_CODE (op) == IOR ? AND : IOR,
                                 mode, in1, in2);
        }

Ok, got it, and/ior:not not pattern LGTM then.

> > > in common code, _if_ no suitable insn can be found. That was at
> > > least the conclusion I drew from looking around a lot, supported
> > > by the code that's generated prior to this change.
> > >
> > > Jan
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao



--
BR,
Hongtao