On Thu, Apr 28, 2016 at 8:36 PM, Bernd Schmidt <[email protected]> wrote:
> On 01/18/2016 10:22 AM, Richard Biener wrote:
>>
>> See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 - the
>> inline expansion
>> for small sizes and equality compares should be done on GIMPLE. Today the
>> strlen pass might be an appropriate place to do this given its
>> superior knowledge
>> about string lengths.
>>
>> The idea of turning eq feeding memcmp into a special memcmp_eq is good but
>> you have to avoid doing that too early - otherwise you'd lose on
>>
>> res = memcmp (p, q, sz);
>> if (memcmp (p, q, sz) == 0)
>> ...
>>
>> that is, you have to make sure CSE got the chance to common the two calls.
>> This is why I think this kind of transform needs to happen in specific
>> places
>> (like during strlen opt) rather than in generic folding.
>
>
> Ok, here's an update. I kept pieces of your patch from that PR, but also
> translating memcmps larger than a single operation into memcmp_eq as in my
> previous patch.
>
> Then, I added by_pieces infrastructure for memcmp expansion. To avoid any
> more code duplication in this area, I abstracted the existing code and
> converted it to C++ classes since that seemed to fit pretty well.
>
> There are a few possible ways I could go with this, which is why I'm posting
> it more as a RFD at this point.
> - should store_by_pieces be eliminated in favour of doing just
> move_by_pieces with constfns?
> - the C++ification could be extended, making move_by_pieces_d and
> compare_by_pieces_d classes inheriting from a common base. This
> would get rid of the callbacks, replacing them with virtuals,
> and also make some of the current struct members private.
> - could move all of the by_pieces stuff out into a separate file?
>
> Later, I think we'll also want to extend this to allow vector mode
> operations, but I think that's a separate patch.
>
> So, opinions what I should be doing with this patch? FWIW it bootstraps and
> tests OK on x86_64-linux.
+struct pieces_addr
+{
...
+ void *m_cfndata;
+public:
+ pieces_addr (rtx, bool, by_pieces_constfn, void *);
unless you strick private: somewhere the public: is redundant
Index: gcc/tree-ssa-forwprop.c
===================================================================
--- gcc/tree-ssa-forwprop.c (revision 235474)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
}
}
break;
+
+ case BUILT_IN_MEMCMP:
+ {
I think this doesn't belong in forwprop. If we want to stick it into
a pass rather than
folding it should be in tree-ssa-strlen.c.
+ if (tree_fits_uhwi_p (len)
+ && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
+ && exact_log2 (leni) != -1
+ && (align1 = get_pointer_alignment (arg1)) >= leni * BITS_PER_UNIT
+ && (align2 = get_pointer_alignment (arg2)) >= leni * BITS_PER_UNIT)
+ {
+ location_t loc = gimple_location (stmt2);
+ tree type, off;
+ type = build_nonstandard_integer_type (leni * BITS_PER_UNIT, 1);
+ gcc_assert (GET_MODE_SIZE (TYPE_MODE (type)) == leni);
+ off = build_int_cst
+ (build_pointer_type_for_mode (char_type_node, ptr_mode, true), 0);
+ arg1 = build2_loc (loc, MEM_REF, type, arg1, off);
+ arg2 = build2_loc (loc, MEM_REF, type, arg2, off);
+ res = fold_convert_loc (loc, TREE_TYPE (res),
+ fold_build2_loc (loc, NE_EXPR,
+ boolean_type_node,
+ arg1, arg2));
+ gimplify_and_update_call_from_tree (gsi_p, res);
+ return true;
+ }
for this part see gimple_fold_builtin_memory_op handling of
/* If we can perform the copy efficiently with first doing all loads
and then all stores inline it that way. Currently efficiently
means that we can load all the memory into a single integer
register which is what MOVE_MAX gives us. */
we might want to share a helper yielding the type of the load/store
or NULL if not possible.
Note that we can handle size-1 memcmp even for ordered compares.
Jakub, where do you think this fits best? Note that gimple-fold.c may
not use immediate uses but would have to match this from the
comparison (I still have to find a way to handle this in match.pd where
the result expression contains virtual operands in the not toplevel stmt).
Richard.
>
> Bernd