https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821

--- Comment #20 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 20 Nov 2017, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
> 
> --- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to Uroš Bizjak from comment #17)
> > Hm, even with the latest patch, the testcase from comment #5:
> > still compiles to:
> > 
> >         movl    %esi, %eax
> >         movw    %si, (%rdi)
> >         notl    %esi
> >         notl    %eax
> >         movb    %sil, 3(%rdi)
> >         movb    %ah, 2(%rdi)
> >         ret
> 
> The reason for that is that the IL is something the bswap framework can't
> handle.  Let's look just at the simplified:
> void baz (char *buf, unsigned int data)
> {
>   buf[2] = ~data >> 8;
>   buf[3] = ~data;
> }
> 
>   _1 = ~data_6(D);
>   _2 = _1 >> 8;
>   _3 = (char) _2;
>   MEM[(char *)buf_7(D) + 2B] = _3;
>   _4 = (char) data_6(D);
>   _5 = ~_4;
>   MEM[(char *)buf_7(D) + 3B] = _5;
> 
> If it was instead:
>   _1 = ~data_6(D);
>   _2 = _1 >> 8;
>   _3 = (char) _2;
>   MEM[(char *)buf_7(D) + 2B] = _3;
>   _4 = (char) _1;
>   MEM[(char *)buf_7(D) + 3B] = _4;
> then it would handle that.  So I think it is a missed optimization in FRE or
> whatever else does SCCVN, or something match.pd should handle.

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c        (revision 254945)
+++ gcc/tree-ssa-sccvn.c        (working copy)
@@ -3632,6 +3632,38 @@ visit_nary_op (tree lhs, gassign *stmt)
                }
            }
        }
+    case BIT_NOT_EXPR:
+      {
+        if (TREE_CODE (rhs1) == SSA_NAME)
+         {
+           gassign *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (rhs1));
+           if (def
+               && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)))
+             {
+               tree ops[3] = {};
+               tree rhs11 = gimple_assign_rhs1 (def);
+               if (TYPE_PRECISION (TREE_TYPE (rhs11))
+                   >= TYPE_PRECISION (TREE_TYPE (rhs1)))
+                 {
+                   ops[0] = rhs11;
+                   tree tem = vn_nary_op_lookup_pieces (1, BIT_NOT_EXPR,
+                                                        TREE_TYPE (rhs11),
+                                                        ops, NULL);
+                   if (tem)
+                     {
+                       ops[0] = tem;
+                       result = vn_nary_build_or_lookup (NOP_EXPR, type, ops);
+                       if (result)
+                         {
+                           bool changed = set_ssa_val_to (lhs, result);
+                           vn_nary_op_insert_stmt (stmt, result);
+                           return changed;
+                         }
+                     }
+                 }
+             }
+         }
+      }
     default:;
     }


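For readers following along, here is a minimal C sketch of the identity the
value-numbering change above relies on: inverting after a truncation gives
the same narrow value as truncating an inversion done in the wider type, so
~(char) data can reuse an already available ~data. The function names are
hypothetical and uint8_t/uint32_t stand in for char/unsigned int; this only
illustrates the arithmetic, not how SCCVN performs the lookup.

```c
#include <assert.h>
#include <stdint.h>

/* Shape found in the IL: truncate first, then invert in the narrow type.
   Mirrors  _4 = (char) data;  _5 = ~_4;  */
static uint8_t not_after_truncate(uint32_t data)
{
    uint8_t narrow = (uint8_t)data;
    return (uint8_t)~narrow;
}

/* Shape the bswap framework can handle: invert in the wide type, then
   truncate.  Mirrors  _1 = ~data;  _4 = (char) _1;  */
static uint8_t truncate_after_not(uint32_t data)
{
    uint32_t wide = ~data;
    return (uint8_t)wide;
}

/* Returns 1 if both shapes agree on a spread of 32-bit inputs, else 0.
   Each 16-bit pattern is replicated into the high half so that the bits
   discarded by the truncation vary as well. */
static int shapes_agree(void)
{
    for (uint32_t i = 0; i <= 0xFFFFu; i++) {
        uint32_t data = i * 0x00010001u;
        if (not_after_truncate(data) != truncate_after_not(data))
            return 0;
    }
    return 1;
}
```

The equality holds because bitwise NOT acts on each bit independently, so
dropping the high bits before or after the inversion leaves the low bits
unchanged. That matches the precision check in the patch, which only
considers conversions from a type at least as wide as the result.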

> As for:
> > void baz (char *buf, unsigned int data)
> > {
> >   buf[0] = data >> 8;
> >   buf[1] = data;
> > }
> not using movbew, that is something that should be done in the backend.
> For the middle-end, we don't have bswap16 and consider {L,R}ROTATE_EXPR by 8
> as the canonical 16-bit byte swap.  Please also have a look:
> unsigned short
> baz (unsigned short *buf)
> {
>   unsigned short a = buf[0];
>   return ((unsigned short) (a >> 8)) | (unsigned short) (a << 8);
> }
> where we could also emit movbew instead of movw + rolw (if it is actually a
> win).  Thus, I think i386.md should provide patterns for combine (or peephole2
> if the former doesn't work for some reason) for this.
> 
>
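As a small illustration of the quoted point that a rotate by 8 is the
canonical 16-bit byte swap, the sketch below shows that the shift-and-or
idiom, a left rotate by 8 and a right rotate by 8 all give the same result
on a 16-bit value. The helper names are hypothetical and nothing here says
whether movbew is actually a win on a given target.

```c
#include <stdint.h>

/* The shift-and-or idiom from the quoted testcase. */
static uint16_t swap16_shifts(uint16_t a)
{
    return (uint16_t)((uint16_t)(a >> 8) | (uint16_t)(a << 8));
}

/* 16-bit rotate left; n is reduced modulo 16.  The arithmetic is done in
   a 32-bit unsigned type to avoid shifting a promoted signed int. */
static uint16_t rotl16(uint16_t a, unsigned n)
{
    n &= 15u;
    return (uint16_t)(((uint32_t)a << n) | ((uint32_t)a >> ((16u - n) & 15u)));
}

/* 16-bit rotate right, expressed through the left rotate. */
static uint16_t rotr16(uint16_t a, unsigned n)
{
    return rotl16(a, (16u - (n & 15u)) & 15u);
}
```

For example, 0x1234 becomes 0x3412 under all three helpers, which is why
either rotate direction can serve as the byte-swap form for 16 bits.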
