https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192

--- Comment #10 from Matt Whitlock <gcc at mattwhitlock dot name> ---
(In reply to Rahul from comment #9)
> I am also experiencing the same issue. Is there any solution for it?

You can wrap a preprocessor macro around string literals that you want to
subject to the linker's garbage collection:

  #include <stdio.h>

  #define GCSTR(str) ({ static const char __str[] = str; __str; })

  void hello() {  // never called, so --gc-sections discards it
      puts(GCSTR("111")); // NOT in .rodata
      puts("222");        //     in .rodata
  }

  int main() {
      puts(GCSTR("333")); //     in .rodata
      puts("444");        //     in .rodata
      return 0;
  }

$ gcc -ffunction-sections -fdata-sections -Wl,--gc-sections -o gcstr gcstr.c

$ objdump -s -j .rodata gcstr

  gcstr:     file format elf64-x86-64

  Contents of section .rodata:
   4005fd 32323200 34343400 33333300           222.444.333.    

The downside of this strategy, however, is that these strings then become
ineligible for merging, so if you have multiple *reachable* occurrences of the
same GCSTR in your code, then you'll have multiple copies of the string data in
the .rodata section of your linked binary.
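
As a rough illustration of that downside, the sketch below uses the macro
twice with the same text and checks that the two results live at different
addresses. The function names are made up for this example, and it assumes
GCC or Clang (statement expressions are a GNU extension) built without
-fmerge-all-constants and without any linker-side folding of identical data.

```c
#include <assert.h>
#include <string.h>

#define GCSTR(str) ({ static const char __str[] = str; __str; })

/* Each expansion of GCSTR defines its own static array, so these two
   (hypothetical) helpers return pointers to two separate copies of "111". */
const char *first_copy(void)  { return GCSTR("111"); }
const char *second_copy(void) { return GCSTR("111"); }

/* Returns 1 when the two strings have equal contents but separate storage. */
int copies_are_distinct(void)
{
    const char *a = first_copy();
    const char *b = second_copy();

    assert(strcmp(a, b) == 0);  /* same characters ...          */
    return a != b;              /* ... but not the same object  */
}
```

Two plain "111" literals, by contrast, are allowed to share storage, which is
exactly the merging that GCSTR gives up.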

These redundant copies would not be present if the compiler were correctly
outputting literal-initialized constant character arrays to sections with the
"merge" and "strings" flags set (which it should do only if
-fmerge-all-constants is set). You can simulate how this could/should work by
editing the compiler's assembly output so that it sets the section flags
appropriately.

Given this program, gcstr.c:

  #include <stdio.h>

  #define GCSTR(str) ({ static const char __str[] = str; __str; })

  int main() {
      puts(GCSTR("111"));
      puts(GCSTR("111"));
      puts("111");
      return 0;
  }

Compile (but do not assemble) the program:

$ gcc -S -ffunction-sections -fdata-sections -fmerge-all-constants -o gcstr.s gcstr.c

Edit the assembly code so that all .rodata.__str.* sections are declared with
the "merge" and "strings" flags and an entity size of 1:

$ sed -e 's/\(\.section\t\.rodata\.__str\..*\),"a",\(@progbits\)$/\1,"aMS",\2,1/' -i gcstr.s
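
To make the effect of that substitution concrete, here is a rough C sketch of
the same rewrite applied to a single directive line. The function name
add_merge_flags is made up for illustration, and the section name used in the
usage note below (.rodata.__str.1) is only a plausible sample; the exact
suffix the compiler emits may differ.

```c
#include <stdio.h>
#include <string.h>

/* Mimics the sed substitution on one line of assembly (no trailing newline).
   Returns 1 if the line was rewritten, 0 if it was copied unchanged, and
   -1 if the output buffer is too small. */
int add_merge_flags(const char *line, char *out, size_t outsz)
{
    static const char prefix[]   = ".section\t.rodata.__str.";
    static const char old_tail[] = ",\"a\",@progbits";
    static const char new_tail[] = ",\"aMS\",@progbits,1";

    size_t len      = strlen(line);
    size_t tail_len = sizeof old_tail - 1;
    const char *p   = strstr(line, prefix);

    /* The directive must name a .rodata.__str.* section and end with the
       plain "a" flags; the tail may not overlap the matched prefix. */
    int matches = p != NULL
        && len >= tail_len
        && strcmp(line + len - tail_len, old_tail) == 0
        && p + (sizeof prefix - 1) <= line + len - tail_len;

    int n;
    if (matches)
        n = snprintf(out, outsz, "%.*s%s",
                     (int)(len - tail_len), line, new_tail);
    else
        n = snprintf(out, outsz, "%s", line);

    if (n < 0 || (size_t)n >= outsz)
        return -1;
    return matches;
}
```

Given the sample line `\t.section\t.rodata.__str.1,"a",@progbits`, the sketch
yields `\t.section\t.rodata.__str.1,"aMS",@progbits,1`; directives for other
sections pass through untouched.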

Now assemble and link the program:

$ gcc -Wl,--gc-sections -o gcstr gcstr.s

Dumping the .rodata section from the resulting executable reveals that the
linker did correctly perform string merging.

$ objdump -s -j .rodata gcstr

  gcstr:     file format elf64-x86-64

  Contents of section .rodata:
   40060d 31313100                             111.            

Compare the above objdump output to that which results when skipping the sed
step:

   40060d 31313100 31313100 31313100           111.111.111.    

The needed correction is that the compiler should, when -fmerge-all-constants
is set, emit literal-initialized constant character array data to a section
with flags "aMS" and entsize==sizeof(T), where T is the type of characters in
the array.
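
For a sense of what entsize==sizeof(T) works out to, the following sketch
computes the element size for a narrow and a wide array. The macro and
function names are invented for this example; only the char result is fixed
by the language, while the wchar_t result depends on the platform.

```c
#include <stddef.h>
#include <wchar.h>

/* Size of one element of an array, i.e. the sizeof(T) that would be used
   as the section's entity size. */
#define ENTSIZE_OF(array) (sizeof((array)[0]))

static const char    narrow_text[] = "111";   /* T = char    */
static const wchar_t wide_text[]   = L"111";  /* T = wchar_t */

size_t narrow_entsize(void) { return ENTSIZE_OF(narrow_text); }
size_t wide_entsize(void)   { return ENTSIZE_OF(wide_text); }
```

narrow_entsize() is always 1, matching the ",1" appended by the sed command
above; wide_entsize() equals sizeof(wchar_t) on the target.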

A further correction (and really the main request in this bug report) would be
for the compiler to emit string literals to discrete sections when
-fdata-sections is set.
