On Mon, Sep 5, 2011 at 11:41 AM, Jakub Jelinek <[email protected]> wrote:
> On Mon, Sep 05, 2011 at 10:54:47AM +0200, Richard Guenther wrote:
>> > 2011-09-02 Jakub Jelinek <[email protected]>
>> >
>> > * common.opt: Add -ftree-strlen option.
>>
>> Maybe sth more generic? -foptimize-string-ops? Eventually guard
>> the existing string op foldings with that flag as well.
>
> I don't think other string op foldings should be dependent on that flag,
> those are already guarded with -fbuiltin-strcpy etc. I think we should have
> a switch just for this pass, and IMHO -foptimize-string-ops is way too
> generic, but of course it can be -ftrack-string-lengths or
> -foptimize-using-string-lengths or similar.
Ok, I guess we can bikeshed a bit about the flag ;) -ftree-strlen
at least doesn't sound very informative (nor do I like the tree- prefix
too much). -foptimize-strlen would be fine with me.
>> > --- gcc/opts.c.jj 2011-06-30 17:58:03.000000000 +0200
>> > +++ gcc/opts.c 2011-09-02 15:53:06.000000000 +0200
>> > @@ -484,6 +484,7 @@ static const struct default_options defa
>> > { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 },
>> > { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 },
>> > { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 },
>> > + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 },
>>
>> Why not -Os? Doesn't it remove strlen calls?
>
> Sometimes it does (though rarer than e.g. with -O2, because with -O2 we e.g.
> fold strcat with constant literal as second argument to strlen plus memcpy,
> but don't do that for -Os). On the other side it transforms strcpy calls
> into memcpy (where the latter is usually longer), etc. And strcat is almost
> always shorter too. So, an alternative would be for -Os enable this pass,
> but only do strlen removal in it (in addition to tracking lengths), nothing
> else. Even the strlen removal can grow the size if the string length
> expression isn't constant, or variable, or arithmetic operation on two
> operands.
Hm, ok. I guess we can leave it for -O2+ for now.
>> > +/* Array indexed by SSA_NAME_VERSION. 0 means unknown, positive value
>> > + is an index into strinfo vector, negative value stands for
>> > + string length of a string literal (~strlen). */
>> > +static int *ssa_ver_to_stridx;
>> > +
>> > +/* Number of currently active string indexes plus one. */
>> > +static int max_stridx;
>>
>> USe a VEC?
>
> For what? ssa_ver_to_stridx isn't growing, we shouldn't use it on the
> SSA_NAMEs added during the pass. And max_stridx isn't the length of
> the ssa_ver_to_stridx array, but how many string indexes are in use
> (i.e. maximum current length of stridx_to_strinfo vector).
Just for consistency across the compiler - I have no strong preferences,
esp. if we don't grow it (but you'd get bounds checking, etc. for free ...)
>> > +/* Free a decl_stridxlist_map. Callback for htab_delete. */
>> > +
>> > +static void
>> > +decl_to_stridxlist_free (void *item)
>> > +{
>> > + struct stridxlist *next;
>> > + struct stridxlist *list = ((struct decl_stridxlist_map *)
>> > item)->list.next;
>> > +
>> > + while (list)
>> > + {
>> > + next = list->next;
>> > + XDELETE (list);
>> > + list = next;
>> > + }
>> > + XDELETE (item);
>>
>> Maybe use an obstack or alloc-pool dependent on re-use?
>
> The number of those is limited by the string index limit, so it doesn't grow
> without bounds. It isn't reused, so yes, obstack could be used and avoid
> the above routine. If you prefer that, I'll change it.
optimizing malloc/free calls is cheap, so yes, I think we should use
an obstack here.
>> > +/* Create a new string index, or return 0 if reached limit. */
>> > +
>> > +static int
>> > +new_stridx (tree exp)
>> > +{
>> > + int idx;
>> > + if (max_stridx == 1000)
>>
>> I suppose make this a #define or --param
>
> Yeah. I guess this one could be a --param.
>
>> > +static void
>> > +find_equal_ptrs (tree ptr, int idx)
>> > +{
>> > + if (TREE_CODE (ptr) != SSA_NAME)
>> > + return;
>> > + while (1)
>> > + {
>> > + gimple stmt = SSA_NAME_DEF_STMT (ptr);
>> > + if (!gimple_assign_single_p (stmt)
>> > + && (!gimple_assign_cast_p (stmt)
>> > + || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))))
>> > + return;
>>
>> I'd prefer postive checks to guard code below. So, you're handling
>> SSA name copies, conversions from pointers and assignments from
>> invariants. You could simply do
>>
>> if (!is+gimple_assign (stmt))
>> return;
>> switch (gimple_assign_rhs_code (stmt))
>> {
>> case SSA_NAME:
>> ..
>> case ADDR_EXPR:
>> ..
>> CASE_CONVERT:
>> ...
>> default:
>> return;
>> }
>
> Ok, will change that.
>
>> > + if (!update_call_from_tree (gsi, rhs))
>> > + {
>> > + rhs = force_gimple_operand_gsi (gsi, rhs, true,
>> > NULL_TREE,
>> > + true, GSI_SAME_STMT);
>> > + if (!update_call_from_tree (gsi, rhs))
>>
>> if update_call_from_tree fails then gimplify_and_update_call_from_tree
>> will always succeed. See gimple_fold_call.
>
> I saw that call, I was worried that fold_all_builtins pass if
> gimplify_and_update_call_from_tree is called forces
> TODO_update_address_taken. rhs will be just a sum of some constants and/or
> SSA_NAMEs though, and the call here is a builtin that ought to be
> non-throwing. So do you think just
> if (!update_call_from_tree
> gimplify_and_update_call_from_tree
> is safe for that case?
Yes. No idea why we force TODO_update_address_taken here - I suppose
it's because of memcpy folding exposing the possibility of removing some
TREE_ADDRESSABLE bits. It's clearly for optimization only.
>> > + len = fold_convert_loc (loc, type, si->length);
>> > + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type,
>> > 1));
>> > + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true,
>> > + GSI_SAME_STMT);
>> > + if (gimple_call_num_args (stmt) == 2)
>> > + rhs = build_call_expr_loc (loc, fn, 3, dst, src, len);
>> > + else
>> > + rhs = build_call_expr_loc (loc, fn, 4, dst, src, len,
>> > + gimple_call_arg (stmt, 2));
>> > + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
>> > + {
>> > + fprintf (dump_file, "Optimizing: ");
>> > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
>> > + }
>> > + if (update_call_from_tree (gsi, rhs))
>>
>> Btw, it would be nice if you'd not use build_call_expr and then gimplify
>> the call but instead build a new gimple call directly ... or modify it
>> in-place.
>
> But then I'd need to set the lhs in a different stmt (the new call - as the
> number of arguments grows), adjust SSA_NAME_DEF_STMT, remove the old call,
> etc.
> If there is a helper for all of that, why not, but update_call_from_tree
> looked like a helper which does that kind of thing.
> In particular I need something like
> new_stmt = gimple_build_call (fn, nops, ...);
> gimple_call_set_lhs (new_stmt, lhs);
> move_ssa_defining_stmt_for_defs (new_stmt, stmt);
> gimple_set_vuse (new_stmt, gimple_vuse (stmt));
> gimple_set_vdef (new_stmt, gimple_vdef (stmt));
> gimple_set_location (new_stmt, gimple_location (stmt));
> gsi_replace (si_p, new_stmt, false);
Yeah, I suppose update_call_from_tree could use a piecewise variant ...
ok, let's defer this as a cleanup for whoever feels like updating some
more code.
Thanks,
Richard.
> Jakub
>