Re: [RFC, WIP] tree-ssa-strlen optimization pass

Richard Guenther Mon, 05 Sep 2011 02:51:15 -0700

On Mon, Sep 5, 2011 at 11:41 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Mon, Sep 05, 2011 at 10:54:47AM +0200, Richard Guenther wrote:
>> > 2011-09-02  Jakub Jelinek  <ja...@redhat.com>
>> >
>> >        * common.opt: Add -ftree-strlen option.
>>
>> Maybe sth more generic?  -foptimize-string-ops?  Eventually guard
>> the existing string op foldings with that flag as well.
>
> I don't think other string op foldings should be dependent on that flag,
> those are already guarded with -fbuiltin-strcpy etc.  I think we should have
> a switch just for this pass, and IMHO -foptimize-string-ops is way too
> generic, but of course it can be -ftrack-string-lengths or
> -foptimize-using-string-lengths or similar.


Ok, I guess we can bikeshed a bit about the flag ;)  -ftree-strlen
at least doesn't sound very informative (nor do I like the tree- prefix
too much). -foptimize-strlen would be fine with me.

>> > --- gcc/opts.c.jj       2011-06-30 17:58:03.000000000 +0200
>> > +++ gcc/opts.c  2011-09-02 15:53:06.000000000 +0200
>> > @@ -484,6 +484,7 @@ static const struct default_options defa
>> >     { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 },
>> >     { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 },
>> >     { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 },
>> > +    { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 },
>>
>> Why not -Os?  Doesn't it remove strlen calls?
>
> Sometimes it does (though rarer than e.g. with -O2, because with -O2 we e.g.
> fold strcat with constant literal as second argument to strlen plus memcpy,
> but don't do that for -Os).  On the other side it transforms strcpy calls
> into memcpy (where the latter is usually longer), etc.  And strcat is almost
> always shorter too.  So, an alternative would be for -Os enable this pass,
> but only do strlen removal in it (in addition to tracking lengths), nothing
> else.  Even the strlen removal can grow the size if the string length
> expression isn't constant, or variable, or arithmetic operation on two
> operands.

Hm, ok.  I guess we can leave it for -O2+ for now.

>> > +/* Array indexed by SSA_NAME_VERSION.  0 means unknown, positive value
>> > +   is an index into strinfo vector, negative value stands for
>> > +   string length of a string literal (~strlen).  */
>> > +static int *ssa_ver_to_stridx;
>> > +
>> > +/* Number of currently active string indexes plus one.  */
>> > +static int max_stridx;
>>
>> USe a VEC?
>
> For what?  ssa_ver_to_stridx isn't growing, we shouldn't use it on the
> SSA_NAMEs added during the pass.  And max_stridx isn't the length of
> the ssa_ver_to_stridx array, but how many string indexes are in use
> (i.e. maximum current length of stridx_to_strinfo vector).

Just for consistency across the compiler - I have no strong preferences,
esp. if we don't grow it (but you'd get bounds checking, etc. for free ...)

>> > +/* Free a decl_stridxlist_map.  Callback for htab_delete.  */
>> > +
>> > +static void
>> > +decl_to_stridxlist_free (void *item)
>> > +{
>> > +  struct stridxlist *next;
>> > +  struct stridxlist *list = ((struct decl_stridxlist_map *) 
>> > item)->list.next;
>> > +
>> > +  while (list)
>> > +    {
>> > +      next = list->next;
>> > +      XDELETE (list);
>> > +      list = next;
>> > +    }
>> > +  XDELETE (item);
>>
>> Maybe use an obstack or alloc-pool dependent on re-use?
>
> The number of those is limited by the string index limit, so it doesn't grow
> without bounds.  It isn't reused, so yes, obstack could be used and avoid
> the above routine.  If you prefer that, I'll change it.

optimizing malloc/free calls is cheap, so yes, I think we should use
an obstack here.

>> > +/* Create a new string index, or return 0 if reached limit.  */
>> > +
>> > +static int
>> > +new_stridx (tree exp)
>> > +{
>> > +  int idx;
>> > +  if (max_stridx == 1000)
>>
>> I suppose make this a #define or --param
>
> Yeah.  I guess this one could be a --param.
>
>> > +static void
>> > +find_equal_ptrs (tree ptr, int idx)
>> > +{
>> > +  if (TREE_CODE (ptr) != SSA_NAME)
>> > +    return;
>> > +  while (1)
>> > +    {
>> > +      gimple stmt = SSA_NAME_DEF_STMT (ptr);
>> > +      if (!gimple_assign_single_p (stmt)
>> > +         && (!gimple_assign_cast_p (stmt)
>> > +             || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))))
>> > +       return;
>>
>> I'd prefer postive checks to guard code below.  So, you're handling
>> SSA name copies, conversions from pointers and assignments from
>> invariants.  You could simply do
>>
>>  if (!is+gimple_assign (stmt))
>>    return;
>>  switch (gimple_assign_rhs_code (stmt))
>>   {
>>    case SSA_NAME:
>>   ..
>> case ADDR_EXPR:
>>   ..
>>   CASE_CONVERT:
>>  ...
>> default:
>> return;
>> }
>
> Ok, will change that.
>
>> > +             if (!update_call_from_tree (gsi, rhs))
>> > +               {
>> > +                 rhs = force_gimple_operand_gsi (gsi, rhs, true, 
>> > NULL_TREE,
>> > +                                                 true, GSI_SAME_STMT);
>> > +                 if (!update_call_from_tree (gsi, rhs))
>>
>> if update_call_from_tree fails then gimplify_and_update_call_from_tree
>> will always succeed.  See gimple_fold_call.
>
> I saw that call, I was worried that fold_all_builtins pass if
> gimplify_and_update_call_from_tree is called forces
> TODO_update_address_taken.  rhs will be just a sum of some constants and/or
> SSA_NAMEs though, and the call here is a builtin that ought to be
> non-throwing.  So do you think just
> if (!update_call_from_tree
>  gimplify_and_update_call_from_tree
> is safe for that case?

Yes.  No idea why we force TODO_update_address_taken here - I suppose
it's because of memcpy folding exposing the possibility of removing some
TREE_ADDRESSABLE bits.  It's clearly for optimization only.

>> > +  len = fold_convert_loc (loc, type, si->length);
>> > +  len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 
>> > 1));
>> > +  len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true,
>> > +                                 GSI_SAME_STMT);
>> > +  if (gimple_call_num_args (stmt) == 2)
>> > +    rhs = build_call_expr_loc (loc, fn, 3, dst, src, len);
>> > +  else
>> > +    rhs = build_call_expr_loc (loc, fn, 4, dst, src, len,
>> > +                              gimple_call_arg (stmt, 2));
>> > +  if (dump_file && (dump_flags & TDF_DETAILS) != 0)
>> > +    {
>> > +      fprintf (dump_file, "Optimizing: ");
>> > +      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
>> > +    }
>> > +  if (update_call_from_tree (gsi, rhs))
>>
>> Btw, it would be nice if you'd not use build_call_expr and then gimplify
>> the call but instead build a new gimple call directly ... or modify it
>> in-place.
>
> But then I'd need to set the lhs in a different stmt (the new call - as the
> number of arguments grows), adjust SSA_NAME_DEF_STMT, remove the old call, 
> etc.
> If there is a helper for all of that, why not, but update_call_from_tree
> looked like a helper which does that kind of thing.
> In particular I need something like
>      new_stmt = gimple_build_call (fn, nops, ...);
>      gimple_call_set_lhs (new_stmt, lhs);
>      move_ssa_defining_stmt_for_defs (new_stmt, stmt);
>      gimple_set_vuse (new_stmt, gimple_vuse (stmt));
>      gimple_set_vdef (new_stmt, gimple_vdef (stmt));
>      gimple_set_location (new_stmt, gimple_location (stmt));
>      gsi_replace (si_p, new_stmt, false);

Yeah, I suppose update_call_from_tree could use a piecewise variant ...
ok, let's defer this as a cleanup for whoever feels like updating some
more code.

Thanks,
Richard.

>        Jakub
>

Re: [RFC, WIP] tree-ssa-strlen optimization pass

Reply via email to