Hi Bernd, Things are moving fast so I'll answer your messages incrementally :-)
> On 3 Aug 2018, at 20:37, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote: [...] > Oh, how interesting. > > My patch only intends to add a dummy byte that is stripped again > in the assembly. The purpose is to make it trivial to find out if > a string constant is NUL-terminated in the middle-end by comparing > TYPE_SIZE_UNIT (TREE_TYPE (decl)) with TREE_STRING_LENGTH (decl). Ah, Ok. I hadn't quite realized that yet. > TYPE_SIZE_UNIT >= TREE_STRING_LENGTH means zero terminated > TYPE_SIZE_UNIT < TREE_STRING_LENGTH means not necessarily zero terminated, That is opposite to the situation we were having for Ada before the patch series: > Such strings shall never chop off more than one nul character. > Ada strings are generally not zero terminated, right? Right. Strings have explicit bounds in Ada, and a NUL character is like any other. It can appear anywhere. For example, the excerpt below constructs a string of 5 characters, bounds 1 to 5, with a nul character in second position: X : String (1 .. 5) := "1" & Ascii.Nul & "345"; ('&' is the catenation operator) The problem we had was that string literals not explicitly NUL terminated (as in X : String := "123" & Ascii.NUL) would never go in mergeable sections. For example: Ada.Text_IO.Put_Line ("Hello World!"); Ada has so called "generic" units that are sometimes instanciated many many times so there is real interest in improving the situation there. > So in your search loop you would not have to look after > type_len, because that byte would be guaranteed to be zero. > > That is if we agree that we want to restrict the string constants > in that way as I proposed. > > In the case str_len > type_len I do not understand if that is right. In your new model, I don't know what sense it would make, indeed. In the previous model, it was clearly expected to be a possibility. > Because varasm will only output type_size bytes, > this seems only to select a string section > but the last nul byte will not be assembled. > Something that is not contained in the type_size > should not be assembled. > > What exactly do you want to accomplish? Based on the previous (say, gcc-7 or gcc-8) code base, arrange for most Ada string literals to end up in a mergeable section on ELF targets. In at least gcc-7 and gcc-8, our gigi adjustment + the patch I posted achieve this effect. For example, for the following source: procedure Hello is procedure Process (S : String) with Import, Convention => C; begin Process ("1234"); end; Without the gigi change, I get (x86_64-linux, -O2 -fmerge-all-constants): .section .rodata .LC0: .ascii "1234" Regular section, no terminating zero. The gigi change alone gets us the terminating nul (*), but not yet a mergeable section, because the extra nul isn't covered by the type bounds: .section .rodata .LC0: .string "1234" With the varasm change on top, checking beyond the type bounds when the string constant isn't an initializer, we get: .section .rodata.str1.1,"aMS",@progbits,1 .LC0: .string "1234" (*) The terminating zero, part of the string value, not spanned by the type bounds, gets assembled through: assemble_constant_contents (...) ... size = get_constant_size (exp); then: get_constant_size (tree exp) size = int_size_in_bytes (TREE_TYPE (exp)); if (TREE_CODE (exp) == STRING_CST) size = MAX (TREE_STRING_LENGTH (exp), size); Again, this is just providing more context on the original patch I posted, based on your questions. I understand there are proposals to change things pretty significantly in this area, so ... now on to your next messages and ideas :-) Olivier