On Mon, Sep 5, 2011 at 11:41 AM, Jakub Jelinek <ja...@redhat.com> wrote: > On Mon, Sep 05, 2011 at 10:54:47AM +0200, Richard Guenther wrote: >> > 2011-09-02 Jakub Jelinek <ja...@redhat.com> >> > >> > * common.opt: Add -ftree-strlen option. >> >> Maybe sth more generic? -foptimize-string-ops? Eventually guard >> the existing string op foldings with that flag as well. > > I don't think other string op foldings should be dependent on that flag, > those are already guarded with -fbuiltin-strcpy etc. I think we should have > a switch just for this pass, and IMHO -foptimize-string-ops is way too > generic, but of course it can be -ftrack-string-lengths or > -foptimize-using-string-lengths or similar.
Ok, I guess we can bikeshed a bit about the flag ;) -ftree-strlen at least doesn't sound very informative (nor do I like the tree- prefix too much). -foptimize-strlen would be fine with me. >> > --- gcc/opts.c.jj 2011-06-30 17:58:03.000000000 +0200 >> > +++ gcc/opts.c 2011-09-02 15:53:06.000000000 +0200 >> > @@ -484,6 +484,7 @@ static const struct default_options defa >> > { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 }, >> > { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 }, >> > { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 }, >> > + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 }, >> >> Why not -Os? Doesn't it remove strlen calls? > > Sometimes it does (though rarer than e.g. with -O2, because with -O2 we e.g. > fold strcat with constant literal as second argument to strlen plus memcpy, > but don't do that for -Os). On the other side it transforms strcpy calls > into memcpy (where the latter is usually longer), etc. And strcat is almost > always shorter too. So, an alternative would be for -Os enable this pass, > but only do strlen removal in it (in addition to tracking lengths), nothing > else. Even the strlen removal can grow the size if the string length > expression isn't constant, or variable, or arithmetic operation on two > operands. Hm, ok. I guess we can leave it for -O2+ for now. >> > +/* Array indexed by SSA_NAME_VERSION. 0 means unknown, positive value >> > + is an index into strinfo vector, negative value stands for >> > + string length of a string literal (~strlen). */ >> > +static int *ssa_ver_to_stridx; >> > + >> > +/* Number of currently active string indexes plus one. */ >> > +static int max_stridx; >> >> USe a VEC? > > For what? ssa_ver_to_stridx isn't growing, we shouldn't use it on the > SSA_NAMEs added during the pass. And max_stridx isn't the length of > the ssa_ver_to_stridx array, but how many string indexes are in use > (i.e. maximum current length of stridx_to_strinfo vector). Just for consistency across the compiler - I have no strong preferences, esp. if we don't grow it (but you'd get bounds checking, etc. for free ...) >> > +/* Free a decl_stridxlist_map. Callback for htab_delete. */ >> > + >> > +static void >> > +decl_to_stridxlist_free (void *item) >> > +{ >> > + struct stridxlist *next; >> > + struct stridxlist *list = ((struct decl_stridxlist_map *) >> > item)->list.next; >> > + >> > + while (list) >> > + { >> > + next = list->next; >> > + XDELETE (list); >> > + list = next; >> > + } >> > + XDELETE (item); >> >> Maybe use an obstack or alloc-pool dependent on re-use? > > The number of those is limited by the string index limit, so it doesn't grow > without bounds. It isn't reused, so yes, obstack could be used and avoid > the above routine. If you prefer that, I'll change it. optimizing malloc/free calls is cheap, so yes, I think we should use an obstack here. >> > +/* Create a new string index, or return 0 if reached limit. */ >> > + >> > +static int >> > +new_stridx (tree exp) >> > +{ >> > + int idx; >> > + if (max_stridx == 1000) >> >> I suppose make this a #define or --param > > Yeah. I guess this one could be a --param. > >> > +static void >> > +find_equal_ptrs (tree ptr, int idx) >> > +{ >> > + if (TREE_CODE (ptr) != SSA_NAME) >> > + return; >> > + while (1) >> > + { >> > + gimple stmt = SSA_NAME_DEF_STMT (ptr); >> > + if (!gimple_assign_single_p (stmt) >> > + && (!gimple_assign_cast_p (stmt) >> > + || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) >> > + return; >> >> I'd prefer postive checks to guard code below. So, you're handling >> SSA name copies, conversions from pointers and assignments from >> invariants. You could simply do >> >> if (!is+gimple_assign (stmt)) >> return; >> switch (gimple_assign_rhs_code (stmt)) >> { >> case SSA_NAME: >> .. >> case ADDR_EXPR: >> .. >> CASE_CONVERT: >> ... >> default: >> return; >> } > > Ok, will change that. > >> > + if (!update_call_from_tree (gsi, rhs)) >> > + { >> > + rhs = force_gimple_operand_gsi (gsi, rhs, true, >> > NULL_TREE, >> > + true, GSI_SAME_STMT); >> > + if (!update_call_from_tree (gsi, rhs)) >> >> if update_call_from_tree fails then gimplify_and_update_call_from_tree >> will always succeed. See gimple_fold_call. > > I saw that call, I was worried that fold_all_builtins pass if > gimplify_and_update_call_from_tree is called forces > TODO_update_address_taken. rhs will be just a sum of some constants and/or > SSA_NAMEs though, and the call here is a builtin that ought to be > non-throwing. So do you think just > if (!update_call_from_tree > gimplify_and_update_call_from_tree > is safe for that case? Yes. No idea why we force TODO_update_address_taken here - I suppose it's because of memcpy folding exposing the possibility of removing some TREE_ADDRESSABLE bits. It's clearly for optimization only. >> > + len = fold_convert_loc (loc, type, si->length); >> > + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, >> > 1)); >> > + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, >> > + GSI_SAME_STMT); >> > + if (gimple_call_num_args (stmt) == 2) >> > + rhs = build_call_expr_loc (loc, fn, 3, dst, src, len); >> > + else >> > + rhs = build_call_expr_loc (loc, fn, 4, dst, src, len, >> > + gimple_call_arg (stmt, 2)); >> > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) >> > + { >> > + fprintf (dump_file, "Optimizing: "); >> > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); >> > + } >> > + if (update_call_from_tree (gsi, rhs)) >> >> Btw, it would be nice if you'd not use build_call_expr and then gimplify >> the call but instead build a new gimple call directly ... or modify it >> in-place. > > But then I'd need to set the lhs in a different stmt (the new call - as the > number of arguments grows), adjust SSA_NAME_DEF_STMT, remove the old call, > etc. > If there is a helper for all of that, why not, but update_call_from_tree > looked like a helper which does that kind of thing. > In particular I need something like > new_stmt = gimple_build_call (fn, nops, ...); > gimple_call_set_lhs (new_stmt, lhs); > move_ssa_defining_stmt_for_defs (new_stmt, stmt); > gimple_set_vuse (new_stmt, gimple_vuse (stmt)); > gimple_set_vdef (new_stmt, gimple_vdef (stmt)); > gimple_set_location (new_stmt, gimple_location (stmt)); > gsi_replace (si_p, new_stmt, false); Yeah, I suppose update_call_from_tree could use a piecewise variant ... ok, let's defer this as a cleanup for whoever feels like updating some more code. Thanks, Richard. > Jakub >