Re: [PATCH 2/4] - relax strlen range optimization to avoid making assumptions about types

Martin Sebor Wed, 10 Oct 2018 20:18:40 -0700

On 10/10/2018 03:18 PM, Jeff Law wrote:

On 10/2/18 10:37 AM, Martin Sebor wrote:

[2/4] - Relax strlen range optimization to avoid making assumptions
        about types


This main part of this patch is to relax the strlen range
optimization to avoid relying on array types.  Instead, the function
either removes the upper bound of the strlen result altogether, or
constrains it by the size of referenced declaration (without
attempting to deal with common symbols).

A seemingly "big" change here is splitting up the static
get_range_strlen workhorse into two functions to make it easier
to follow.  The attached cc-99999-2-gimple-fold.c.diff-b shows
the diff for the file without considering whitespace changes.

An important change to the get_range_strlen function worth calling
out to is the introduction of the tight_bound local variable.
It controls whether the upper bound computed by the function is
suitable for optimization (the larger value) or warnings
(the smaller value).

Another important change here is replacing the type and fuzzy
arguments to get_range_strlen with a single enum.  The two arguments
were confusing and with all combinations of their possible values
being meaningful.  The original extern get_range_strlen overload
with the type and fuzzy arguments is retained in this patch to
keep the changes contained.  It is removed in [4/4].

Finally, a large number of tests needed adjusting here.  I also
added a few new tests to better exercise the changes.


gcc-99999-2.diff

[2/4] - Relax strlen range optimization to avoid making assumptions about types

gcc/ChangeLog:

        * calls.c (maybe_warn_nonstring_arg): Test for -1.
        * gimple-fold.c (get_range_strlen): Forward declare static.
        (get_range_strlen_tree): New helper.
        (get_range_strlen): Move tree code into get_range_strlen_tree.
        Replace type and fuzzy arguments with a single enum.
        Avoid optimizing ranges based on type, optimize on decl size intead.
        (get_range_strlen): New extern overload.
        (get_range_strlen): Use new overload above.
        (get_maxval_strlen): Declared static.  Assert preconditions.
        Use new overload above.
        (gimple_fold_builtin_strcpy): Adjust to pass enum.
        (gimple_fold_builtin_strncpy): Same.
        (gimple_fold_builtin_strcat): Same.
        (gimple_fold_builtin_fputs): Same.
        (gimple_fold_builtin_memory_chk): Same.
        (gimple_fold_builtin_stxcpy_chk): Same.
        (gimple_fold_builtin_stxncpy_chk): Same.
        (gimple_fold_builtin_snprintf_chk): Same.
        (gimple_fold_builtin_sprintf): Same.
        (gimple_fold_builtin_snprintf): Same.
        (gimple_fold_builtin_strlen): Simplify.  Call set_strlen_range.
        * gimple-fold.h (StrlenType): New enum.
        (get_maxval_strlen): Remove.
        (get_range_strlen): Change default argument value to true (strict).
        * tree-ssa-strlen.c (set_strlen_range): New function.
        (maybe_set_strlen_range): Call it.  Make static.
        (maybe_diag_stxncpy_trunc): Use new get_range_strlen overload.
        Adjust.
        * tree-ssa-strlen.h (set_strlen_range): Declare.

gcc/testsuite/ChangeLog:

        * g++.dg/init/strlen.C: New test.
        * gcc.c-torture/execute/strlen-5.c: New test.
        * gcc.c-torture/execute/strlen-6.c: New test.
        * gcc.c-torture/execute/strlen-7.c: New test.
        * gcc.dg/strlenopt-36.c: Remove tests for an overly aggressive
        optimization.
        * gcc.dg/strlenopt-40.c: Adjust tests to reflect a relaxed
        optimization, remove others for an overly aggressive optimization.
        * gcc.dg/strlenopt-45.c: Same.
        * gcc.dg/strlenopt-48.c: Adjust tests to reflect a relaxed
        optimization.
        * gcc.dg/strlenopt-51.c: Same.
        * gcc.dg/strlenopt-59.c: New test.

So from a review standpoint, a distinct patch that splits up the
function, but makes no behavioral changes would help.  That can be gated
in quickly and it reduces the amount of noise one has to sort through in
the changes that affect behavior.  Though the git diff ignoring
whitespace definitely helped.


The part with no behavioral changes is patch 1/4.  This part is
meant to relax the optimization.  The other changes like adding
the enum are meant to help make the code readable.


Camelcase is verboten.  Please don't use it (StrlenType and its
associated enumeration values).   Make the enumeration type lower case
and make the enumeration values all caps.  That's standard practice.

Was this tested independently?  I applied patch #1 and this to my tree,
and I get a bunch of regressions.

builtin-snprintf-warn-?.c
builtin-sprintf-warn-?.c
pr79376.c

In all there's ~75 regressions with this patch.


I tested each patch individually but overlooked some regressions
among all the intermittent failures.  Ultimately, it only makes
sense to me to either apply the whole series or nothing.  I don't
think there are "nuggets" in there that could be applied
independently of the rest.  At least it wasn't my expectation
-- I only broke it up to make the review easier.


It's unclear if those are inherent in the patch, a result of previous
API changes (c_strlen in particular), or me incorrectly resolving
conflicts for various hunks.  Some may be fixed in #3 or #4.  In which
case we just need to be aware of the dependencies between the patches
from a testsuite regression standpoint.

It would be advisable to go back to Bernd's patch and see if his new
tests (strlenopt-59 and strlenopt-60) are covered by the new tests
you've included here.

Y'all approached changing the existing tests in significantly different
ways.  It's a bit hard to tease out if there's any other testsuite
changes from Bernd that we'd want to pick up.  If you could take a look
at the other changes Bernd made to the testsuite and see if you're
covering everything he did, it'd be helpful.


I kept individual assertions as long as they're verifying
a subrange of the max length (i.e., < PTRDIFF_MAX).   I got
rid of those where the range became essentially open ended,
and added more where I thought coverage could be improved.
My impression of Bernd's test suite changes is that he either
chucked the lot that failed or xfailed them wholesale.

I'm traveling next week so I won't have time to do much until
I get back.

Martin

Re: [PATCH 2/4] - relax strlen range optimization to avoid making assumptions about types

Reply via email to