On Mon, Jan 20, 2014 at 01:35:05PM +0100, Richard Biener wrote:
> > tree-ssa-strlen.c apparently both doesn't handle this case (unknown first strlen,
> > known second strlen or movstr pattern, will only transform that if the
> > length of the resulting string is needed afterwards), and isn't run
> > for -Os or -O1 anyway.
> 
> Well, I'm not sure under which circumstances this should be an 
> unconditional win anyway (I expect the strcat library fn to be
> optimized well enough, and only if you can avoid the strlen on the
> dest in the end it will be profitable)

No, while the strcat library fn can be very optimized, it still has no info
about how long the second parameter is.  A strcat implementation is
typically an optimized strchr (dst, 0) followed by an optimized strcpy,
where the two can perhaps avoid some alignment adjustments or something.
But the strcpy still has to test each word (or whatever chunk it reads)
for terminating zeros, while if you do an (optimized) strlen followed by a
memcpy where you already know the length, that is a win.
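
To make the comparison concrete, here is a minimal sketch of the two forms.
The helper names concat_via_strcat and concat_known_len are made up for
illustration; this is not GCC code and not what the folder emits verbatim.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the plain library call.  strcat has to find the
   end of dst and then copy src while checking for its terminating zero,
   because it is never told how long src is.  */
static char *
concat_via_strcat (char *dst, const char *src)
{
  assert (dst != NULL && src != NULL);
  return strcat (dst, src);
}

/* Hypothetical helper: the form that becomes possible when the caller
   already knows strlen (src).  Only dst is scanned; the copy is a
   fixed-size memcpy that includes the terminating zero.  */
static char *
concat_known_len (char *dst, const char *src, size_t src_len)
{
  assert (dst != NULL && src != NULL);
  memcpy (dst + strlen (dst), src, src_len + 1);
  return dst;
}
```

Both helpers produce the same string; the second merely skips the
per-chunk zero test during the copy, which is the saving described above.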

> > But I guess if we optimize it again, your testcase would crash again, right?
> 
> Right.  We can apply the optimization at RTL expansion time though,
> or handle the folding completely in gimple-fold with not needing to
> dispatch to the gimplifier.

After playing with the testcase in a debugger, my strong preference at
least for the 4.8 branch would be just a global flag (or context flag) to
prevent the nested folding.  I think the only problematic thing is what
starts with the avoid_folding_inline_builtin check in gimple_fold_builtin,
and we should just prevent that from happening when called from within
gimplify_and_update_call_from_tree (or just during that call when called
from gimple_fold_call?).
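
As a rough illustration of the flag idea, here is a toy model.  Every name
in it (in_nested_update, try_aggressive_fold, update_call) is hypothetical
and none of it is taken from the GCC sources; it only shows the
save/set/restore pattern for suppressing a nested transformation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global flag: true while the toy "update call" routine
   is running, so that nested aggressive folding is suppressed.  */
static bool in_nested_update;

/* Counts how often the aggressive path actually ran.  */
static int aggressive_folds;

/* Toy stand-in for the harder folding attempt.  It refuses to run
   while the flag is set.  */
static bool
try_aggressive_fold (void)
{
  if (in_nested_update)
    return false;
  aggressive_folds++;
  return true;
}

/* Toy stand-in for the routine during which nested folding must not
   happen.  The flag is saved and restored so the guard nests safely.  */
static bool
update_call (void)
{
  bool saved = in_nested_update;
  in_nested_update = true;
  bool folded = try_aggressive_fold ();
  /* The guard must have suppressed the aggressive path.  */
  assert (!folded);
  in_nested_update = saved;
  return folded;
}
```

Saving and restoring the previous value, rather than unconditionally
clearing the flag, keeps the guard correct if the guarded routine is ever
entered recursively.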

Normally, if folding of a builtin folds it into a call to some other
builtin, that other builtin is folded right away, so the common case is
optimized immediately.  The only problem is if gimple_fold_builtin tries
harder to optimize using maximum lengths or exact length (as in this case).
And for this it wouldn't even help if we handled STRCAT/STRCAT_CHK
specially too and passed the src length to the folder routine, because
gimple_fold_builtin first folds normally and only if that fails tries
harder.

        Jakub
