https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
--- Comment #20 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 20 Nov 2017, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
>
> --- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to Uroš Bizjak from comment #17)
> > Hm, even with the latest patch, the testcase from comment #5:
> > still compiles to:
> >
> >         movl    %esi, %eax
> >         movw    %si, (%rdi)
> >         notl    %esi
> >         notl    %eax
> >         movb    %sil, 3(%rdi)
> >         movb    %ah, 2(%rdi)
> >         ret
>
> The reason for that is that the IL is something the bswap framework can't
> handle.  Let's look just at the simplified:
> void baz (char *buf, unsigned int data)
> {
>   buf[2] = ~data >> 8;
>   buf[3] = ~data;
> }
>
>   _1 = ~data_6(D);
>   _2 = _1 >> 8;
>   _3 = (char) _2;
>   MEM[(char *)buf_7(D) + 2B] = _3;
>   _4 = (char) data_6(D);
>   _5 = ~_4;
>   MEM[(char *)buf_7(D) + 3B] = _5;
>
> If it was instead:
>   _1 = ~data_6(D);
>   _2 = _1 >> 8;
>   _3 = (char) _2;
>   MEM[(char *)buf_7(D) + 2B] = _3;
>   _4 = (char) _1;
>   MEM[(char *)buf_7(D) + 3B] = _4;
> then it would handle that.  So I think it is a missed optimization in FRE or
> whatever else does SCCVN, or something match.pd should handle.
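
The two IL forms quoted above differ only in whether the bitwise NOT is applied before or after the truncation to char. As a rough illustration of why they can be value-numbered the same, here is a minimal C sketch. It assumes an 8-bit unsigned narrow type (uint8_t stands in for the char in the testcase), and the function names are hypothetical, not taken from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Form seen in the IL: truncate first, then invert the narrow value
   (_4 = (char) data; _5 = ~_4).  */
static uint8_t
not_after_truncate (uint32_t data)
{
  uint8_t low = (uint8_t) data;
  return (uint8_t) ~low;
}

/* Form the bswap framework is said to handle: invert in the wide type,
   then truncate (_1 = ~data; _4 = (char) _1).  */
static uint8_t
truncate_after_not (uint32_t data)
{
  uint32_t inverted = ~data;
  return (uint8_t) inverted;
}

/* Returns 1 if both forms agree on a spread of sample inputs, else 0.
   The multiplier is an arbitrary odd constant used to scatter the bits.  */
int
check_not_truncate_equivalence (void)
{
  for (uint32_t i = 0; i < 0x20000u; i++)
    {
      uint32_t data = i * 0x9E3779B1u;
      if (not_after_truncate (data) != truncate_after_not (data))
        return 0;
    }
  return 1;
}
```

Since NOT acts on each bit independently, dropping the high bits before or after it yields the same low byte, so an existing ~data in the wider type can be reused through a conversion.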
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c        (revision 254945)
+++ gcc/tree-ssa-sccvn.c        (working copy)
@@ -3632,6 +3632,38 @@ visit_nary_op (tree lhs, gassign *stmt)
                }
            }
        }
+    case BIT_NOT_EXPR:
+      {
+        if (TREE_CODE (rhs1) == SSA_NAME)
+          {
+            gassign *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (rhs1));
+            if (def
+                && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)))
+              {
+                tree ops[3] = {};
+                tree rhs11 = gimple_assign_rhs1 (def);
+                if (TYPE_PRECISION (TREE_TYPE (rhs11))
+                    >= TYPE_PRECISION (TREE_TYPE (rhs1)))
+                  {
+                    ops[0] = rhs11;
+                    tree tem = vn_nary_op_lookup_pieces (1, BIT_NOT_EXPR,
+                                                         TREE_TYPE (rhs11),
+                                                         ops, NULL);
+                    if (tem)
+                      {
+                        ops[0] = tem;
+                        result = vn_nary_build_or_lookup (NOP_EXPR, type, ops);
+                        if (result)
+                          {
+                            bool changed = set_ssa_val_to (lhs, result);
+                            vn_nary_op_insert_stmt (stmt, result);
+                            return changed;
+                          }
+                      }
+                  }
+              }
+          }
+      }
     default:;
     }

> As for:
> > void baz (char *buf, unsigned int data)
> > {
> >   buf[0] = data >> 8;
> >   buf[1] = data;
> > }
> not using movbew, that is something that should be done in the backend.
> For the middle-end, we don't have bswap16 and consider {L,R}ROTATE_EXPR by 8
> as the canonical 16-bit byte swap.  Please also have a look:
> unsigned short
> baz (unsigned short *buf)
> {
>   unsigned short a = buf[0];
>   return ((unsigned short) (a >> 8)) | (unsigned short) (a << 8);
> }
> where we could also emit movbew instead of movw + rolw (if it is actually a
> win).  Thus, I think i386.md should provide patterns for combine (or peephole2
> if the former doesn't work for some reason) for this.
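
For readers unfamiliar with the point about {L,R}ROTATE_EXPR by 8 being the canonical 16-bit byte swap, the following is a small C sketch of the identity, assuming a 16-bit unsigned short modelled as uint16_t. The helper names are hypothetical and the rotates are written out by hand instead of relying on any compiler builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Byte swap written with two shifts, as in the quoted testcase.  */
static uint16_t
swap_via_shifts (uint16_t a)
{
  return (uint16_t) ((uint16_t) (a >> 8) | (uint16_t) (a << 8));
}

/* Rotate left within 16 bits; the count is reduced modulo 16.  */
static uint16_t
rotl16 (uint16_t a, unsigned n)
{
  n &= 15;
  if (n == 0)
    return a;
  return (uint16_t) (((uint32_t) a << n) | ((uint32_t) a >> (16 - n)));
}

/* Rotate right within 16 bits; the count is reduced modulo 16.  */
static uint16_t
rotr16 (uint16_t a, unsigned n)
{
  n &= 15;
  if (n == 0)
    return a;
  return (uint16_t) (((uint32_t) a >> n) | ((uint32_t) a << (16 - n)));
}

/* Returns 1 if rotating by 8 in either direction matches the shift-based
   byte swap for every 16-bit value, else 0.  */
int
rotate_by_8_is_byte_swap (void)
{
  for (uint32_t v = 0; v <= 0xFFFFu; v++)
    {
      uint16_t a = (uint16_t) v;
      uint16_t expected = swap_via_shifts (a);
      if (rotl16 (a, 8) != expected || rotr16 (a, 8) != expected)
        return 0;
    }
  return 1;
}
```

Because 8 is half the width, the left and right rotates coincide, which is why either rotate code can represent the swap in the middle-end while the choice of movbew versus movw + rolw is left to the backend.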