On Wed, May 17, 2017 at 07:52:38AM +0100, Richard Sandiford wrote:
> 2017-05-17 Richard Sandiford <[email protected]>
>
> gcc/
> * tree-ssa-strlen.c (strinfo): Rename the length field to
> nonzero_chars. Add a full_string_p field.
> (compare_nonzero_chars, zero_length_string_p): New functions.
> (get_addr_stridx): Add an offset_out parameter.
> Use compare_nonzero_chars.
> (get_stridx): Update accordingly. Use compare_nonzero_chars.
> (new_strinfo): Update after above changes to strinfo.
> (set_endptr_and_length): Set full_string_p.
> (get_string_length): Update after above changes to strinfo.
> (unshare_strinfo): Update call to new_strinfo.
> (maybe_invalidate): Likewise.
> (get_stridx_plus_constant): Change off to unsigned HOST_WIDE_INT.
> Use compare_nonzero_chars and zero_string_p. Treat nonzero_chars
> as a uhwi instead of an shwi. Update after above changes to
> strinfo and new_strinfo.
> (zero_length_string): Assert that chainsi contains full strings.
> Use zero_length_string_p. Update call to new_strinfo.
> (adjust_related_strinfos): Update after above changes to strinfo.
> Copy full_string_p from origsi.
> (adjust_last_stmt): Use zero_length_string_p.
> (handle_builtin_strlen): Update after above changes to strinfo and
> new_strinfo. Install the lhs as the string length if the previous
> entry didn't describe a full string.
> (handle_builtin_strchr): Update after above changes to strinfo
> and new_strinfo.
> (handle_builtin_strcpy): Likewise.
> (handle_builtin_strcat): Likewise.
> (handle_builtin_malloc): Likewise.
> (handle_pointer_plus): Likewise.
> (handle_builtin_memcpy): Likewise. Track nonzero characters
> that aren't necessarily followed by a nul terminator.
> (handle_char_store): Likewise.
>
> gcc/testsuite/
> * gcc.dg/strlenopt-32.c: New testcase.
> * gcc.dg/strlenopt-33.c: Likewise.
> * gcc.dg/strlenopt-33g.c: Likewise.
> * gcc.dg/strlenopt-34.c: Likewise.
> * gcc.dg/strlenopt-35.c: Likewise.
Ok, with a small nit.
> @@ -501,8 +550,8 @@ set_endptr_and_length (location_t loc, s
> static tree
> get_string_length (strinfo *si)
> {
> - if (si->length)
> - return si->length;
> + if (si->nonzero_chars)
> + return si->full_string_p ? si->nonzero_chars : NULL;
This should be NULL_TREE.
>
> if (si->stmt)
> {
> @@ -595,19 +644,19 @@ get_string_length (strinfo *si)
> for (strinfo *chainsi = verify_related_strinfos (si);
> chainsi != NULL;
> chainsi = get_next_strinfo (chainsi))
> - if (chainsi->length == NULL)
> + if (chainsi->nonzero_chars == NULL)
And this should be NULL_TREE too (though that one is preexisting).
For future work, it would be nice if we could handle not just
memcpy and single character stores, but also cases where a memcpy
is folded into a single store of a couple of adjacent bytes. Say
  MEM_REF[ptr, 0] = 0x12345678;
stores 4 non-zero bytes, while = 0x345678; would give zero nonzero_chars
plus full_string_p on big endian, and 3 non-zero bytes followed by a zero
byte on little endian. One could use native_encode_expr on the rhs and then
determine the number of non-zero bytes at the start and whether a zero
byte follows them.
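
To make the byte counting concrete, here is a rough standalone sketch, not
GCC code. encode_int is only a hypothetical stand-in for what
native_encode_expr would produce for an integer constant, and prefix_info
and scan_stored_bytes are made-up names. It assumes the store is at most
8 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, loosely mirroring the two strinfo fields.  */
struct prefix_info
{
  size_t nonzero_chars;  /* Number of leading non-zero bytes.  */
  int full_string_p;     /* Nonzero if a zero byte follows them.  */
};

/* Stand-in for native_encode_expr: write the low SIZE bytes of VAL
   into BUF in the requested byte order.  */
static void
encode_int (uint64_t val, size_t size, int big_endian_p, unsigned char *buf)
{
  assert (size <= 8);
  for (size_t i = 0; i < size; i++)
    {
      size_t shift = 8 * (big_endian_p ? size - 1 - i : i);
      buf[i] = (unsigned char) (val >> shift);
    }
}

/* Count the non-zero bytes at the start of BUF and record whether the
   store itself supplies a terminating zero byte.  */
static struct prefix_info
scan_stored_bytes (const unsigned char *buf, size_t size)
{
  struct prefix_info info = { 0, 0 };
  while (info.nonzero_chars < size && buf[info.nonzero_chars] != 0)
    info.nonzero_chars++;
  info.full_string_p = info.nonzero_chars < size;
  return info;
}
```

With a 4-byte store, 0x12345678 gives 4 non-zero bytes and no terminator
in either byte order; 0x345678 gives 0 non-zero bytes plus a terminator
for big endian, and 3 non-zero bytes plus a terminator for little endian.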
Jakub