Hi!

Both the C and C++ front ends diagnose arrays larger than half of the address space:

/tmp/1.c:1:6: error: size of array ‘a’ is too large
 char a[__SIZE_MAX__ / 2 + 1];
      ^

because one can't do pointer arithmetic on them.  But we don't have anything similar for string literals.  As internally we use host int as TREE_STRING_LENGTH, this is relevant to targets that have < 32-bit size_t only.
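
To illustrate why only sub-32-bit targets are affected, here is a small standalone sketch.  It assumes a 32-bit host int (modelled with INT32_MAX) and a signed size type of a given precision; the helper names are made up for illustration and are not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: the largest value of a signed
   size type that is PRECISION bits wide, i.e. the analogue of
   TYPE_MAX_VALUE (ssizetype) on such a target.  */
long long
ssize_max_for_precision (int precision)
{
  assert (precision >= 2 && precision <= 63);
  return (1LL << (precision - 1)) - 1;
}

/* Hypothetical helper: can a length held in a 32-bit host int exceed
   that limit?  Only then is a dedicated diagnostic needed.  */
bool
int_length_can_exceed_limit (int precision)
{
  return (long long) INT32_MAX > ssize_max_for_precision (precision);
}
```

With a 16-bit size_t the limit is 32767, which a host int length can easily exceed; with a 32-bit size_t the limit equals INT32_MAX, so the length can never be above it.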

The following patch adds that diagnostic and truncates the string literals.

Bootstrapped/regtested on x86_64-linux and i686-linux and tested with a cross to avr.  I'll defer adjusting testcases to the maintainers of 16-bit ports.  From the PR it seems gcc.dg/concat2.c, g++.dg/parse/concat1.C and pr46534.c tests are affected.

Ok for trunk?

2018-11-16  Jakub Jelinek  <ja...@redhat.com>

	PR middle-end/87854
	* c-common.c (fix_string_type): Reject string literals larger than
	TYPE_MAX_VALUE (ssizetype) bytes.

--- gcc/c-family/c-common.c.jj	2018-11-14 13:37:46.921050615 +0100
+++ gcc/c-family/c-common.c	2018-11-15 15:20:31.138056115 +0100
@@ -737,31 +737,44 @@ tree
 fix_string_type (tree value)
 {
   int length = TREE_STRING_LENGTH (value);
-  int nchars;
+  int nchars, charsz;
   tree e_type, i_type, a_type;
 
   /* Compute the number of elements, for the array type.  */
   if (TREE_TYPE (value) == char_array_type_node || !TREE_TYPE (value))
     {
-      nchars = length;
+      charsz = 1;
       e_type = char_type_node;
     }
   else if (TREE_TYPE (value) == char16_array_type_node)
     {
-      nchars = length / (TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (char16_type_node) / BITS_PER_UNIT;
       e_type = char16_type_node;
     }
   else if (TREE_TYPE (value) == char32_array_type_node)
     {
-      nchars = length / (TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (char32_type_node) / BITS_PER_UNIT;
       e_type = char32_type_node;
     }
   else
     {
-      nchars = length / (TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT);
+      charsz = TYPE_PRECISION (wchar_type_node) / BITS_PER_UNIT;
       e_type = wchar_type_node;
     }
 
+  /* This matters only for targets where ssizetype has smaller precision
+     than 32 bits.  */
+  if (wi::lts_p (wi::to_wide (TYPE_MAX_VALUE (ssizetype)), length))
+    {
+      error ("size of string literal is too large");
+      length = tree_to_shwi (TYPE_MAX_VALUE (ssizetype)) / charsz * charsz;
+      char *str = CONST_CAST (char *, TREE_STRING_POINTER (value));
+      memset (str + length, '\0',
+	      MIN (TREE_STRING_LENGTH (value) - length, charsz));
+      TREE_STRING_LENGTH (value) = length;
+    }
+  nchars = length / charsz;
+
   /* C89 2.2.4.1, C99 5.2.4.1 (Translation limits).  The analogous
      limit in C++98 Annex B is very large (65536) and is not normative,
      so we do not diagnose it (warn_overlength_strings is forced off

	Jakub
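
For readers without a GCC tree at hand, the truncation arithmetic in the new hunk can be tried outside the compiler.  This is only a sketch under stated assumptions: truncate_literal is a hypothetical stand-in, max_len plays the role of TYPE_MAX_VALUE (ssizetype), and a plain char buffer replaces the STRING_CST payload.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the truncation step in the patch.
   STR holds LENGTH bytes, MAX_LEN models TYPE_MAX_VALUE (ssizetype)
   and CHARSZ is the size of one character in bytes.
   Returns the (possibly reduced) length in bytes.  */
int
truncate_literal (char *str, int length, int max_len, int charsz)
{
  assert (charsz > 0 && length >= 0 && max_len >= charsz);
  if (length <= max_len)
    return length;

  /* Round the limit down to a whole number of characters.  */
  int new_len = max_len / charsz * charsz;

  /* Zero at most one character's worth of bytes at the cut point,
     mirroring the MIN (...) argument of the memset in the patch.  */
  int tail = length - new_len;
  memset (str + new_len, '\0', tail < charsz ? tail : charsz);
  return new_len;
}
```

For example, with a limit of 32767 bytes and 4-byte characters the length is rounded down to 32764, so nchars = length / charsz stays exact.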