On 08/17/2018 07:53 AM, Bernd Edlinger wrote: > On 08/17/18 15:38, Richard Biener wrote: >> On Fri, 17 Aug 2018, Bernd Edlinger wrote: >> >>> On 08/17/18 14:19, Richard Biener wrote: >>>> On Fri, 17 Aug 2018, Bernd Edlinger wrote: >>>> >>>>> Richard Biener wrote: >>>>>> +embedded @code{NUL} characters. However, the >>>>>> +@code{TREE_STRING_LENGTH} always includes a trailing @code{NUL} that >>>>>> +is not part of the language string literal but appended by the front >>>>>> end. >>>>>> +If the string shall not be @code{NUL}-terminated the @code{TREE_TYPE} >>>>>> +is one character shorter than @code{TREE_STRING_LENGTH}. >>>>>> +Excess caracters other than one trailing @code{NUL} character are not >>>> >>>> characters btw. >>>> >>> >>> thanks, updated. >>> >>>> I read the above that the string literal for >>>> >>>> char x[2] = "1"; >>>> >>>> is actually "1\0\0" - there's one NUL that is not part of the language >>>> string literal. The second sentence then suggests that both \0 >>>> are removed because 2 is less than 3? >>>> >>> >>> maybe 2 is a bad example, lets consider: >>> char x[2000000000] = "1"; >>> >>> That is a string_cst with STRING_LENGTH = 2, content = "2\0\0" >>> the array_type is used on both x, and the string_cst, >>> I was assuming that both tree objects refer to the same type object. >>> which is char[0..2000000000-1] >> >> Oh, didn't realize we use char[200000000] for the STRING_CST. Makes >> my suggestion to use char[] instead not (very) much worse than the >> existing practice then. >> >>> varasm assembles the bytes that are given by STRING_LENGTH >>> and appends zeros as appropriate. >>> >>>> As said, having this extra semantics of a STRING_CST tied to >>>> another tree node (its TREE_TYPE) looks ugly. >>>> >>>>>> +permitted. >>>>>> >>>>>> I find this very confusing and oppose to that change. Can we get >>>>>> back to the drawing board please? If we want an easy way to >>>>>> see whether a string is "properly" terminated then maybe we can >>>>>> simply use a flag that gets set by build_string? >>>>>> >>>>> >>>>> What I mean with that is the case like >>>>> char x[2] = "123456"; >>>>> >>>>> which is build_string(7, "123456"), but with a type char[2], >>>>> so varasm throws away "3456\0". >>>> >>>> I think varasm throws away chars not because of the type of >>>> the STRING_CST but because of the available storage in x. >>>> >>> >>> But at other places we look at the type of the string_cst, don't we? >>> Shouldn't those be the same? >> >> I think most (all?) places look at TREE_TYPE (TREE_TYPE (string)) >> only. I'm not aware of users of the array domain of the array type >> of a string - but I'm far from knowing GCC inside-out ;) >> > > Yes, I know, that happens to me as well on the first day after my holidays ;) I'm not aware of anything using the array domain of the array type of a string. But there's still a lot of old code in tree.c and expr.c. Who knows what would be found with a thorough audit.
jeff