On Fri, Apr 28, 2017 at 6:35 PM, Jeff Law <[email protected]> wrote: > On 04/25/2017 09:55 PM, Martin Sebor wrote: >> >> On 04/25/2017 04:05 PM, Jeff Law wrote: >>> >>> On 04/21/2017 03:33 PM, Martin Sebor wrote: >>>> >>>> Bug 77671 - missing -Wformat-overflow warning on sprintf overflow >>>> with "%s", is caused by gimple-fold.c transforming s{,n}printf >>>> calls with a plain "%s" format string into strcpy regardless of >>>> whether they are valid well before the -Wformtat-overflow warning >>>> has had a chance to validate them. >>>> >>>> The attached patch moves the transformation from gimple-fold.c >>>> into the gimple-ssa-sprintf.c pass, thus allowing the calls to >>>> be validated. Only valid calls (i.e., those that do not overflow >>>> the destination) are transformed. >>>> >>>> Martin >>>> >>>> gcc-77671.diff >>>> >>>> >>>> commit 2cd98763984ffb93606ee96ad658733b4c95c1e4 >>>> Author: Martin Sebor<[email protected]> >>>> Date: Wed Apr 12 18:36:26 2017 -0600 >>>> >>>> PR middle-end/77671 - missing -Wformat-overflow warning on >>>> sprintf overflow >>>> with %s >>>> gcc/ChangeLog: >>>> PR middle-end/77671 >>>> * gimple-fold.c (gimple_fold_builtin_sprintf): Make extern. >>>> (gimple_fold_builtin_snprintf): Same. >>>> * gimple-fold.h (gimple_fold_builtin_sprintf): Declare. >>>> (gimple_fold_builtin_snprintf): Same. >>>> * gimple-ssa-sprintf.c (get_format_string): Correct the >>>> detection >>>> of character types. >>>> (is_call_safe): New function. >>>> (try_substitute_return_value): Call it. >>>> (try_simplify_call): New function. >>>> (pass_sprintf_length::handle_gimple_call): Call it. >>>> gcc/testsuite/ChangeLog: >>>> PR middle-end/77671 >>>> * gcc.dg/tree-ssa/builtin-sprintf-7.c: New test. >>>> * gcc.dg/tree-ssa/builtin-sprintf-8.c: New test. >>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust. >>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Adjust. >>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Adjust. >>> >>> I assume this went through the usual regression testing cycle? I'm a >>> bit surprised nothing failed due to moving the transformation later in >>> the pipeline. >> >> >> It did. There was one regression I neglected to mention. A test >> exercising -fexec-charset (bug 20110) fails because the sprintf >> pass assumes the execution character set is the same as the host >> character set. With -fexec-charset set to EBCDIC it gets just >> as confused as the -Wformat warning does (this is subject of >> the still unresolved bug 20110). I've opened bug 80523 for >> the new problem and I'm testing a patch that handles it. >> >> >>>> diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c >>>> index 2474391..07d6897 100644 >>>> --- a/gcc/gimple-ssa-sprintf.c >>>> +++ b/gcc/gimple-ssa-sprintf.c >>>> @@ -373,7 +373,16 @@ get_format_string (tree format, location_t *ploc) >>>> if (TREE_CODE (format) != STRING_CST) >>>> return NULL; >>>> - if (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format))) != >>>> char_type_node) >>>> + tree type = TREE_TYPE (format); >>>> + if (TREE_CODE (type) == ARRAY_TYPE) >>>> + type = TREE_TYPE (type); >>>> + >>>> + /* Can't just test that: >>>> + TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format))) != >>>> char_type_node >>>> + See bug 79062. */ >>>> + if (TREE_CODE (type) != INTEGER_TYPE >>>> + || TYPE_MODE (type) != TYPE_MODE (char_type_node) >>>> + || TYPE_PRECISION (type) != TYPE_PRECISION (char_type_node)) >>>> { >>>> /* Wide format string. */ >>>> return NULL; >>> >>> In the referenced BZ Richi mentions how fold-const.c checks for this >>> case and also hints that you might want t look at tree-ssa-strlen as >>> well. That seems wise. It also seems wise to factor that check and use >>> it from all the identified locations. I don't like that this uses a >>> different check than what's in fold-const.c. >> >> >> I think what I did comes from tree-ssa-strlen.c (Richi's other >> suggestion in bug 79062). In either case I don't fully understand >> why the existing code doesn't work. >> >>> It's also not clear if this change is a requirement to address 77671 or >>> not. If so ISTM that we fix 79062 first, then 77671. If not we can fix >>> them independently. In both cases the fix for 79062 can be broken out >>> into its own patch. >> >> >> I suppose that makes sense. The hunk above doesn't fully fix >> 79062. It just lets the sprintf return value optimization take >> place with -flto. >> >>>> @@ -3278,31 +3287,21 @@ get_destination_size (tree dest) >>>> return HOST_WIDE_INT_M1U; >>>> } >>>> -/* Given a suitable result RES of a call to a formatted output >>>> function >>>> - described by INFO, substitute the result for the return value of >>>> - the call. The result is suitable if the number of bytes it >>>> represents >>>> - is known and exact. A result that isn't suitable for substitution >>>> may >>>> - have its range set to the range of return values, if that is known. >>>> - Return true if the call is removed and gsi_next should not be >>>> performed >>>> - in the caller. */ >>>> - >>>> static bool >>>> -try_substitute_return_value (gimple_stmt_iterator *gsi, >>>> - const pass_sprintf_length::call_info &info, >>>> - const format_result &res) >>>> +is_call_safe (const pass_sprintf_length::call_info &info, >>>> + const format_result &res, bool under4k, >>>> + unsigned HOST_WIDE_INT retval[2]) >>> >>> You need a function comment for is_call_safe. >>> >>> >>> >>>> @@ -3423,6 +3458,30 @@ try_substitute_return_value >>>> (gimple_stmt_iterator *gsi, >>>> return removed; >>>> } >>>> +static bool >>>> +try_simplify_call (gimple_stmt_iterator *gsi, >>>> + const pass_sprintf_length::call_info &info, >>>> + const format_result &res) >>> >>> Needs a function comment. >>> >>> >>> >>> So there's the minor nits to add function comments. The big question >>> is the stuff for 79062 and a confirmation that this went through a full >>> regression test cycle. >> >> >> I think I may have included the (partial) the fix for 79062 to >> get some tests to pass but I'm not 100% sure. >> >> Let me first submit the fix for the -fexec-charset limitation >> (bug 80523), see if I can separate out the partial fix for 79062, >> and then resubmit this patch. > > OK. I think 80523 is pretty close. I mentioned one implementation detail > (initialize translation table once per file rather than once per function) > and one question about what happens for long translated strings. I suspect > you'll have that one good-to-go shortly. > >> >> FWIW, my fix for bug 79062 is only partial (it gets the pass >> to run but the warnings are still not issued). I don't quite >> understand what prevents the warning flag(s) from getting set >> when -flto is used. This seems to be a bigger problem than >> just the sprintf pass not doing something just right. > > I've never dug deeply in the LTO stuff, but I believe we stream the compiler > flags, so it could be something there.
We do. > Alternately you might be running into a case where in LTO mode we recreate > base types. Look for a type equality tester that goes beyond just testing > pointer equality. > > ie, in LTO I think we'll create a type based on the streamed data, but I > also think we'll create various basic types. Thus in LTO mode pointer > equality may not be sufficient. We make sure that for most basic types we end up re-using them where possible. char_type_node is an example where that generally doesn't work because it's value depends on a command-line flag. Richard. > jeff
