On 07/19/2018 12:19 AM, Bernd Edlinger wrote:
if (TREE_CODE (idx) != INTEGER_CST
&& TREE_CODE (argtype) == POINTER_TYPE)
{
/* From a pointer (but not array) argument extract the variable
index to prevent get_addr_base_and_unit_offset() from failing
due to it. Use it later to compute the non-constant offset
into the string and return it to the caller. */
varidx = idx;
ref = TREE_OPERAND (arg, 0);
tree type = TREE_TYPE (arg);
if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (type) != INTEGER_TYPE)
return NULL_TREE;
}
the condition TREE_CODE(type) == ARRAY_TYPE
&& TREE_CODE (type) != INTEGER_TYPE looks funny.
Check for ARRAY_TYPE should imply != INTEGER_TYPE.
Yes, that other test was superfluous. I've removed it in
the updated patch I just posted.
else if (DECL_P (arg))
{
array = arg;
chartype = TREE_TYPE (arg);
}
chartype is only used in the if (varidx) block, but that is always zero
in this case.
True. Chartype could be left uninitialized. I did it mostly
out of an abundance of caution (I don't like leaving variables
with such big scope uninitialized). In the latest update
I instead initialize chartype to null.
while (TREE_CODE (chartype) == ARRAY_TYPE
|| TREE_CODE (chartype) == POINTER_TYPE)
chartype = TREE_TYPE (chartype);
you multiply sizeof(chartype) with varidx but you should probably
use the type of the TREE_OPERAND (arg, 0); above instead.
The offset returned to the caller is relative to the character
array so varidx must also be the index into the same array.
Otherwise it cannot be used. The difference is this:
const char a[][4] = { "12", "123" };
int x = strlen (&a[0][i]); // use i as the offset
int y = strlen (&a[i][0]); // bail
this is not in the patch, but I dont like it at all, because it compares the
size of a single initializer against the full size of the array. But it should
be the innermost enclosing array:
tree array_size = DECL_SIZE_UNIT (array);
if (!array_size || TREE_CODE (array_size) != INTEGER_CST)
return NULL_TREE;
/* Avoid returning a string that doesn't fit in the array
it is stored in, like
const char a[4] = "abcde";
but do handle those that fit even if they have excess
initializers, such as in
const char a[4] = "abc\000\000";
The excess elements contribute to TREE_STRING_LENGTH()
but not to strlen(). */
unsigned HOST_WIDE_INT length
= strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
if (compare_tree_int (array_size, length + 1) < 0)
return NULL_TREE;
The use of strnlen here isn't right for wide strings and needs
to be replaced with a call to string_length() from builtins.c.
I've made that change in the updated patch.
The remaining concern is orthogonal to the changes to fix
the wrong code. As I mentioned, I opened a few bugs to
improve things in this area: 86434, 86572, 86552. As
the first step I'm about to post a solution for 86552.
consider the following test case:
$ cat part.c
const char a[2][3][8] = { { "a", "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb", "ccc"},
{ "dddd", "eeeee", "ffffff" } };
int main ()
{
int n = __builtin_strlen (&a[0][1][0]);
if (n == 30)
__builtin_abort ();
}
With my patch for pr86552 GCC prints:
warning: initializer-string for array of chars is too long
const char a[2][3][8] = { { "a", "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb", "ccc"},
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: (near initialization for ‘a[0][1]’)
In function ‘main’:
warning: ‘__builtin_strlen’ argument missing terminating nul
[-Wstringop-overflow=]
int n = __builtin_strlen (&a[0][1][0]);
^~~~~~~~~~~~~~~~
note: referenced argument declared here
const char a[2][3][8] = { { "a", "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb", "ccc"},
We can discuss what value, if any, might be more appropriate
to fold the length to in these undefined cases but I would
prefer to have that discussion separately from this review.
Martin