On 6/25/19 3:38 PM, Jeff Law wrote:
On 6/24/19 6:47 PM, Martin Sebor wrote:
On 6/24/19 5:59 PM, Jeff Law wrote:
On 6/24/19 5:50 PM, Martin Sebor wrote:
The strlen enhancement committed in r263018 to handle multi-character
assignments extended the handle_char_store() function to handle such
stores via MEM_REFs. Prior to that the function only dealt with
single-char stores. The enhancement neglected to constrain a case
in the function that assumed the function's previous constraint.
As a result, when the original optimization takes place with
a multi-character store, the function computes the wrong string
length.
The attached patch adds the missing constraint.
Martin
gcc-90989.diff
PR tree-optimization - incorrrect strlen result after second strcpy into
the same destination
gcc/ChangeLog:
* tree-ssa-strlen.c (handle_char_store): Constrain a single
character
optimization to just single character stores.
gcc/testsuite/ChangeLog:
* gcc.dg/strlenopt-26.c: Exit with test result status.
* gcc.dg/strlenopt-67.c: New test.
Index: gcc/tree-ssa-strlen.c
===================================================================
--- gcc/tree-ssa-strlen.c (revision 272618)
+++ gcc/tree-ssa-strlen.c (working copy)
@@ -3462,34 +3462,38 @@ handle_char_store (gimple_stmt_iterator *gsi)
return false;
}
}
- /* If si->nonzero_chars > OFFSET, we aren't overwriting '\0',
- and if we aren't storing '\0', we know that the length of the
- string and any other zero terminated string in memory remains
- the same. In that case we move to the next gimple statement and
- return to signal the caller that it shouldn't invalidate anything.
- This is benefical for cases like:
+ if (cmp > 0
+ && storing_nonzero_p
+ && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE)
I'm not sure I follow why checking for TREE_CODE (TREE_TYPE (rhs)) ==
INTEGER_TYPE helps here. If you need to check that we're storing bytes,
then don't you need to check the size, not just the TREE_CODE of the
type?
handle_char_store is only called for single-character assignments
or MEM_REF assignments to/from arrays. The type of the RHS is only
integer when storing a single character.
I don't see any requirement here that INTEGER_TYPE implies a single byte
though. That seems to be true in simple tests I've tried, but what's to
stop us from using something like 0x31323300 on the RHS for a big endian
machine to store "123"?
The caller ensures that handle_char_store is only called for stores
to arrays (MEM_REF) or single elements as wide as char.
What you describe sounds like
char a[N];
*(int*)a = 0x31323300;
which is represented as
MEM[(int *)&b] = 825373440;
The LHS type of that is int so the function doesn't get called.
And if the NUL byte in the original was at byte offset 2, then didn't we
just change the length by overwriting where the NUL is?
No, because cmp is the result of compare_nonzero_chars and cmp > 0
means:
1 if SI is known to start with more than OFF nonzero characters
i.e., the character is being stored before the terminating nul.
This is the basis of the original optimization:
/* If si->nonzero_chars > OFFSET, we aren't overwriting '\0',
and if we aren't storing '\0', we know that the length of the
string and any other zero terminated string in memory remains
the same.
ISTM you actually have to look at contents of the RHS object, not just
its type.
That's already been done earlier by calling initializer_zerop.
The function sets storing_nonzero_p when the sequence of
characters in RHS is not all zeros. For single integers it
calls integer_zerop. Since the type of the RHS is an integer
we know the RHS is a single non-zero byte.
Martin