On 10/30/2018 09:27 AM, Jeff Law wrote:
On 10/29/18 5:51 PM, Martin Sebor wrote:
The missing nul detection fails when the argument of the %s or
similar sprintf directive is the address of a non-nul character
constant such as in:
const char c = 'a';
int f (void)
{
return snprintf (0, 0, "%s", &c);
}
This is because the string_constant function only succeeds for
arguments that refer to STRRING_CSTs, not to individual characters.
For the same reason, calls to memchr() such as the one below aren't
folded into constants:
const char d = '\0';
void* g (void)
{
return memchr (&d, 0, 1);
}
To detect and diagnose the missing nul in the first example and
to fold the second, the attached patch modifies string_constant
to return a synthesized STRING_CST object for such references
(while also indicating whether such an object is properly
nul-terminated).
Tested on x86_64-linux.
Martin
gcc-87756.diff
PR tree-optimization/87756 - missing unterminated argument warning using
address of a constant character
gcc/ChangeLog:
PR tree-optimization/87756
* expr.c (string_constant): Synthesize a string literal from
the address of a constant character.
* tree.c (build_string_literal): Add an argument.
* tree.h (build_string_literal): Same.
gcc/testsuite/ChangeLog:
PR tree-optimization/87756
* gcc.dg/builtin-memchr-2.c: New test.
* gcc.dg/builtin-memchr-3.c: Same.
* gcc.dg/warn-sprintf-no-nul-2.c: Same.
Index: gcc/expr.c
===================================================================
--- gcc/expr.c (revision 265496)
+++ gcc/expr.c (working copy)
@@ -11484,18 +11484,40 @@ string_constant (tree arg, tree *ptr_offset, tree
offset = off;
}
- if (!init || TREE_CODE (init) != STRING_CST)
+ if (!init)
return NULL_TREE;
+ *ptr_offset = offset;
+
+ tree eltype = TREE_TYPE (init);
+ tree initsize = TYPE_SIZE_UNIT (eltype);
if (mem_size)
- *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
+ *mem_size = initsize;
+
if (decl)
*decl = array;
- gcc_checking_assert (tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (init)))
- >= TREE_STRING_LENGTH (init));
+ if (TREE_CODE (init) == INTEGER_CST)
+ {
+ /* For a reference to (address of) a single constant character,
+ store the native representation of the character in CHARBUF. */
+ unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
+ int len = native_encode_expr (init, charbuf, sizeof charbuf, 0);
+ if (len > 0)
+ {
+ /* Construct a string literal with elements of ELTYPE and
+ the representation above. Then strip
+ the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST. */
+ init = build_string_literal (len, (char *)charbuf, eltype);
+ init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
+ }
+ }
- *ptr_offset = offset;
+ if (TREE_CODE (init) != STRING_CST)
+ return NULL_TREE;
+
+ gcc_checking_assert (tree_to_shwi (initsize) >= TREE_STRING_LENGTH (init));
+
return init;
}
Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 265496)
+++ gcc/tree.c (working copy)
Index: gcc/tree.h
===================================================================
--- gcc/tree.h (revision 265496)
+++ gcc/tree.h (working copy)
@@ -4194,7 +4194,7 @@ extern tree build_call_expr_internal_loc_array (lo
extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
int, ...);
extern tree build_alloca_call_expr (tree, unsigned int, HOST_WIDE_INT);
-extern tree build_string_literal (int, const char *);
+extern tree build_string_literal (int, const char *, tree = char_type_node);
/* Construct various nodes representing data types. */
There's only about a dozen calls to build_string_literal. Instead of
defaulting the argument, just fix them. OK with that change. Make
sure to catch those in config/{rs6000,i386}/ and cp/
Why? Default arguments (and overloading) exist in C++ to deal
with just this case: to avoid having to provide the common
argument value while letting callers provide a different value
when they need to. What purpose will it serve to make these
unnecessary changes and to force new callers to provide
the default argument value? It will only make using
the function more error-prone and its callers harder
to read.
Martin