Sorry, Dave,
what should I do with this patch? Bernd. On 11/9/18 5:52 PM, Bernd Edlinger wrote: > Hi Dave, > > is the patch OK, or do you still have questions? > > > Thanks > Bernd. > > On 11/2/18 10:48 PM, Bernd Edlinger wrote: >> On 11/2/18 9:40 PM, David Malcolm wrote: >>> On Sun, 2018-08-05 at 16:59 +0000, Bernd Edlinger wrote: >>>> Hi! >>>> >>>> >>>> My other patch with adds assertions to varasm.c regarding correct >>>> nul termination of sting literals did make these incorrect string >>>> constants in JIT frontend fail. >>>> >>>> The string constants are not nul terminated if their length exceeds >>>> 200 characters. The test cases do not use strings of that size where >>>> that would make a difference. But using a fixed index type is >>>> clearly >>>> wrong. >>>> >>>> This patch removes the fixed char[200] array type from >>>> playback::context, >>>> and uses build_string_literal instead of using build_string directly. >>>> >>>> >>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >>>> Is it OK for trunk? >>> >>> Sorry for the belated response. >>> >>> Was this tested with --enable-host-shared and --enable-languages=jit ? >>> Note that "jit" is not included in --enable-languages=all. >>> >> >> Yes, of course. The test suite contains a few string constants, just >> all of them are shorter than 200 characters. But I think removing this >> artificial limit enables the existing test cases to test that the >> shorter string is in fact zero terminated. >> >>> The patch seems reasonable, but I'm a little confused over the meaning >>> of "len" in build_string_literal and build_string: does it refer to the >>> length or the size of the string? >>> >> >> build_string_literal: >> For languages that use zero-terminated strings, len is strlen(str)+1, and >> str is a zero terminated single-byte character string. >> For languages that don't use zero-terminated strings, len is the size of >> the string and str is not zero terminated. >> >> build_string: >> constructs a STRING_CST tree object, which is usable as is in some contexts, >> like for asm constraints, but as a string literal it is incomplete, and >> needs an index type. The index type defines the memory size which must >> be larger than the string precision. Excess memory is implicitly cleared. >> >> This means currently all jit strings shorter than 200 characters >> are filled with zero up to the limit of 200 chars as imposed by >> m_char_array_type_node. Strings of exactly 200 chars are not zero >> terminated, >> and larger strings should result in an assertion (excess precision was >> previously >> allowed, but no zero termination was appended, when that is not part of >> the original string constant). >> >> Previously it was allowed to have memory size less than the string len, which >> had complicated the STRING_CST semantics in the middle-end, but with the >> string_cst semantic rework I did for gcc-9 this is no longer allowed and >> results in (checking) assertions in varasm.c. >> >>>> @@ -617,16 +616,9 @@ playback::rvalue * >>>> playback::context:: >>>> new_string_literal (const char *value) >>>> { >>>> - tree t_str = build_string (strlen (value), value); >>>> - gcc_assert (m_char_array_type_node); >>>> - TREE_TYPE (t_str) = m_char_array_type_node; >>>> - >>>> - /* Convert to (const char*), loosely based on >>>> - c/c-typeck.c: array_to_pointer_conversion, >>>> - by taking address of start of string. */ >>>> - tree t_addr = build1 (ADDR_EXPR, m_const_char_ptr, t_str); >>>> + tree t_str = build_string_literal (strlen (value) + 1, value); >>>> - return new rvalue (this, t_addr); >>>> + return new rvalue (this, t_str); >>>> } >>> >>> In the above, the call to build_string with strlen is replaced with >>> build_string_literal with strlen + 1. >>> >>> >>> build_string's comment says: >>> >>> "Note that for a C string literal, LEN should include the trailing >>> NUL." >>> >>> but has: >>> >>> length = len + offsetof (struct tree_string, str) + 1; >>> >>> and: >>> >>> TREE_STRING_LENGTH (s) = len; >>> memcpy (s->string.str, str, len); >>> s->string.str[len] = '\0'; >>> >>> suggesting that the "len" parameter is in fact the length *without* the >>> trailing NUL, and that a trailing NUL is added by build_string. >>> >> >> Yes, string constants in tree objects have another zero termiation, >> but varasm.c does something different, there the index range takes >> precedence. >> The index range is built in build_string_literal as follows: >> >> elem = build_type_variant (char_type_node, 1, 0); >> index = build_index_type (size_int (len - 1)); >> type = build_array_type (elem, index); >> >> therefore the string constant hast the type char[0..len-1] >> thus only len bytes are significant for code generation, the extra >> nul is just for "convenience". >> >>> However build_string_literal has: >>> >>> t = build_string (len, str); >>> elem = build_type_variant (char_type_node, 1, 0); >>> index = build_index_type (size_int (len - 1)); >>> >>> suggesting that the len is passed directly to build_string (and thus >>> ought to be strlen), but the build_index_type uses len - 1 (which >>> suggests that len is the size of the string, rather than its length). >>> >>> What's the intended meaning of len in these functions? >>> >> >> I hope this helps. >> >> >> Thanks >> Bernd. >> >> >>> Thanks >>> Dave >>>