On 7/22/19 3:33 PM, Jeff Law wrote:
On 7/19/19 4:04 PM, Martin Sebor wrote:
On targets with permissive alignment requirements GCC sometimes
lowers stores of short (between two and 16 bytes), power-of-two
char sequences to single integer stores of the corresponding
width. This happens for sequences of ordinary character stores
as well as for some calls to memcpy.
However, the strlen pass is only prepared to handle character
stores and not those of wider integers. As a result, the strlen
optimization tends to get defeated in cases when it could benefit
the most: very short strings. I counted 1544 instances where
the strlen optimization was disabled in a GCC build on x86_64
due to this sort of early store merging, and over two thousand
in a build of the Linux kernel.
In addition, -Wstringop-overflow only considers calls to string
functions and is ineffective against past-the-end accesses by
these merged multibyte stores.
To improve the effectiveness of both the optimization as well
as the warning the attached patch enhances the strlen pass to
consider these wide accesses. Since the infrastructure for doing
this is already in place (strlen can compute multibyte accesses
via MEM_REFs of character arrays), the enhancement isn't very
intrusive. It relies on the native_encode_expr function to
determine the encoding of an expression and its "length".
I tested the patch on x86_64. I expect the tests the patch
adds will need some adjustment for big-endian and strictly
aligned targets.
Martin
gcc-91183.diff
PR tree-optimization/91183 - strlen of a strcpy result with a conditional
source not folded
PR tree-optimization/86688 - missing -Wstringop-overflow using a non-string
local array in strnlen with excessive bound
gcc/ChangeLog:
PR tree-optimization/91183
PR tree-optimization/86688
* builtins.c (compute_objsize): Handle MEM_REF.
* tree-ssa-strlen.c (class ssa_name_limit_t): New.
(get_min_string_length): Remove.
(count_nonzero_bytes): New function.
(handle_char_store): Rename...
(handle_store): to this. Handle multibyte stores via integer types.
(strlen_check_and_optimize_stmt): Adjust conditional and the called
function name.
gcc/testsuite/ChangeLog:
PR tree-optimization/91183
PR tree-optimization/86688
* gcc.dg/attr-nonstring-2.c: Remove xfails.
* gcc.dg/strlenopt-70.c: New test.
* gcc.dg/strlenopt-71.c: New test.
* gcc.dg/strlenopt-72.c: New test.
* gcc.dg/strlenopt-8.c: Remove xfails.
+
if (TREE_CODE (dest) != ADDR_EXPR)
return NULL_TREE;
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 88b6bd7869e..ca1aca3ed9e 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
+
+/* If the SSA_NAME has already been "seen" return a positive value.
+ Otherwise add it to VISITED. If the SSA_NAME limit has been
+ reached, return a negative value. Otherwise return zero. */
+
+int ssa_name_limit_t::next_ssa_name (tree ssa_name)
Nit. Return type on a different line than the function's name. The
point behind that convention is to make it easier to search for a
function's definition.
+/* Determine the minimum and maximum number of leading non-zero bytes
+ in the representation of EXP and set LENRANGE[0] and LENRANGE[1]
+ to each. Set LENRANGE[2] to the total number of bytes in
+ the representation. Set *NULTREM if the representation contains
+ a zero byte, and set *ALLNUL if all the bytes are zero. Avoid
+ recursing deeper than the limits in SNLIM allow. Return true
+ on success and false otherwise. */
S/NULTREM/NULTERM/
if (si != NULL)
{
- int cmp = compare_nonzero_chars (si, offset);
- gcc_assert (offset == 0 || cmp >= 0);
- if (storing_all_zeros_p && cmp == 0 && si->full_string_p)
+ /* Set to 1 if the string described by SI is being written into
+ before the terminating nul (if it has one), to zero if the nul
+ is being overwritten but not beyond, or negative otherwise. */
+ int store_b4_nul[2];
Umm "store_b4_nul"? Are you really trying to save 3 characters in the
name by writing it this way? :-)
I'm actually saving four characters over "store_before_nul" ;-)
It's just a name I picked because I like it. Would you prefer to
go back to the original 'cmp' instead? It doesn't get much shorter
than that, or less descriptive, especially when there is no comment
explaining what it's for. (FWIW, I renamed it because I found myself
going back to the description of compare_nonzero_chars to remember
what cmp actually meant.)
You've got two entries in the array, but the comment reads as if there's
just one element. What is the difference between the two array elements?
Since handle_store deals with sequences of one or more bytes some
of the optimizations it implements(*) need to consider both where
the first byte of the sequence is stored and where the last one is.
The first element of this array is for the first byte and the second
one is for the last character. The comment a few lines down is meant
to make the distinction clear but we can expand the comment above
the variable even more.
/* The offset of the last stored byte. */
unsigned HOST_WIDE_INT endoff = offset + lenrange[2] - 1;
(lenrange[2] is the size of the store.)
[*] E.g., the one we discussed not so long ago (PR90989) or the one
that removes a store of '\0' over the terminating '\0'.
I didn't see anything terribly concerning so far, but I do want to look
at handle_store again after the comment is fixed so that I'm sure I'm
interpreting things correctly.
Does this help?
/* The corresponding element is set to 1 if the first and last
element, respectively, of the sequence of characters being
written over the string described by SI ends before
the terminating nul (if it has one), to zero if the nul is
being overwritten but not beyond, or negative otherwise. */
Martin