On Thu, Apr 28, 2016 at 8:36 PM, Bernd Schmidt <bschm...@redhat.com> wrote:
> On 01/18/2016 10:22 AM, Richard Biener wrote:
>>
>> See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 - the
>> inline expansion
>> for small sizes and equality compares should be done on GIMPLE.  Today the
>> strlen pass might be an appropriate place to do this given its
>> superior knowledge
>> about string lengths.
>>
>> The idea of turning eq feeding memcmp into a special memcmp_eq is good but
>> you have to avoid doing that too early - otherwise you'd lose on
>>
>>    res = memcmp (p, q, sz);
>>    if (memcmp (p, q, sz) == 0)
>>     ...
>>
>> that is, you have to make sure CSE got the chance to common the two calls.
>> This is why I think this kind of transform needs to happen in specific
>> places
>> (like during strlen opt) rather than in generic folding.
>
>
> Ok, here's an update. I kept pieces of your patch from that PR, but also
> translating memcmps larger than a single operation into memcmp_eq as in my
> previous patch.
>
> Then, I added by_pieces infrastructure for memcmp expansion. To avoid any
> more code duplication in this area, I abstracted the existing code and
> converted it to C++ classes since that seemed to fit pretty well.
>
> There are a few possible ways I could go with this, which is why I'm posting
> it more as a RFD at this point.
>  - should store_by_pieces be eliminated in favour of doing just
>    move_by_pieces with constfns?
>  - the C++ification could be extended, making move_by_pieces_d and
>    compare_by_pieces_d classes inheriting from a common base. This
>    would get rid of the callbacks, replacing them with virtuals,
>    and also make some of the current struct members private.
>  - could move all of the by_pieces stuff out into a separate file?
>
> Later, I think we'll also want to extend this to allow vector mode
> operations, but I think that's a separate patch.
>
> So, opinions what I should be doing with this patch? FWIW it bootstraps and
> tests OK on x86_64-linux.

+struct pieces_addr
+{
...
+  void *m_cfndata;
+public:
+  pieces_addr (rtx, bool, by_pieces_constfn, void *);

unless you strick private: somewhere the public: is redundant

Index: gcc/tree-ssa-forwprop.c
===================================================================
--- gcc/tree-ssa-forwprop.c     (revision 235474)
+++ gcc/tree-ssa-forwprop.c     (working copy)
@@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
            }
        }
       break;
+
+    case BUILT_IN_MEMCMP:
+      {

I think this doesn't belong in forwprop.  If we want to stick it into
a pass rather than
folding it should be in tree-ssa-strlen.c.

+       if (tree_fits_uhwi_p (len)
+           && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
+           && exact_log2 (leni) != -1
+           && (align1 = get_pointer_alignment (arg1)) >= leni * BITS_PER_UNIT
+           && (align2 = get_pointer_alignment (arg2)) >= leni * BITS_PER_UNIT)
+         {
+           location_t loc = gimple_location (stmt2);
+           tree type, off;
+           type = build_nonstandard_integer_type (leni * BITS_PER_UNIT, 1);
+           gcc_assert (GET_MODE_SIZE (TYPE_MODE (type)) == leni);
+           off = build_int_cst
+            (build_pointer_type_for_mode (char_type_node, ptr_mode, true), 0);
+           arg1 = build2_loc (loc, MEM_REF, type, arg1, off);
+           arg2 = build2_loc (loc, MEM_REF, type, arg2, off);
+           res = fold_convert_loc (loc, TREE_TYPE (res),
+                                   fold_build2_loc (loc, NE_EXPR,
+                                                    boolean_type_node,
+                                                    arg1, arg2));
+           gimplify_and_update_call_from_tree (gsi_p, res);
+           return true;
+         }

for this part see gimple_fold_builtin_memory_op handling of

      /* If we can perform the copy efficiently with first doing all loads
         and then all stores inline it that way.  Currently efficiently
         means that we can load all the memory into a single integer
         register which is what MOVE_MAX gives us.  */

we might want to share a helper yielding the type of the load/store
or NULL if not possible.

Note that we can handle size-1 memcmp even for ordered compares.

Jakub, where do you think this fits best?  Note that gimple-fold.c may
not use immediate uses but would have to match this from the
comparison (I still have to find a way to handle this in match.pd where
the result expression contains virtual operands in the not toplevel stmt).

Richard.

>
> Bernd

Reply via email to