Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Fangrui Song via Gcc
On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song  wrote:
>
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song  wrote:
> >
> > The relocation formats REL and RELA for ELF are inefficient. In a
> > release build of Clang for x86-64, .rela.* sections consume a
> > significant portion (approximately 20.9%) of the file size.
> >
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!
> >
> > Detailed analysis:
> > https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> > generic ABI (ELF specification):
> > https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
> > binutils feature request: 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=31475
> > LLVM: 
> > https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
> >
> > Implementation primarily involves binutils changes. Any volunteers?
> > For GCC, a driver option like -mrelleb in my Clang prototype would be
> > needed. The option instructs the assembler to use RELLEB.
>
> The format was tentatively named RELLEB. As I refine the original pure
> LEB-based format, “RELLEB” might not be the most fitting name.
>
> I have switched to SHT_CREL/DT_CREL/.crel and updated
> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> and
> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>
> The new format is simpler and better than RELLEB even in the absence
> of the shifted offset technique.
>
> Dynamic relocations using CREL are even smaller than Android's packed
> relocations.
>
> // encodeULEB128(uint64_t, raw_ostream &os);
> // encodeSLEB128(int64_t, raw_ostream &os);
>
> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
> uint32_t symidx = 0, type = 0;
> for (const Reloc &rel : relocs)
>   offsetMask |= rel.r_offset;
> // Low bits that are zero in every offset can be shifted out (at most 3).
> int shift = std::countr_zero(offsetMask);
> encodeULEB128(relocs.size() * 4 + shift, os);
> for (const Reloc &rel : relocs) {
>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>   offset = rel.r_offset;
>   // Bits 0-2: symidx/type/addend changed; bits 3-6: low delta offset bits.
>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>               (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
>   if (deltaOffset < 0x10) {
>     os << char(b);
>   } else {
>     os << char(b | 0x80);
>     encodeULEB128(deltaOffset >> 4, os);
>   }
>   if (b & 1) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
>     symidx = rel.r_symidx;
>   }
>   if (b & 2) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
>     type = rel.r_type;
>   }
>   if (b & 4) {
>     encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
>     addend = rel.r_addend;
>   }
> }
>
> ---
>
> While alternatives like PrefixVarInt (or a suffix-based variant) might
> excel when encoding larger integers, LEB128 offers advantages when
> most integers fit within one or two bytes, as it avoids the need for
> shift operations in the common one-byte representation.
>
> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
> SLEB128-encoded type/addend to use ULEB128 instead, the generate code
> is inferior to or on par with SLEB128 for one-byte encodings.
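
For readers unfamiliar with the encodings compared above, here is a minimal sketch of ULEB128, SLEB128 and 32-bit zigzag. The helper names uleb128, sleb128 and zigzag32 are made up for this illustration; they are not the encodeULEB128/encodeSLEB128 functions referenced in the quoted snippet, which write to a raw_ostream. The sketch assumes that right-shifting a negative value is an arithmetic shift, which is true of mainstream compilers and guaranteed since C++20.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ULEB128: 7 payload bits per byte, least significant group first; the
// high bit is set on every byte except the last.
inline std::string uleb128(uint64_t v) {
  std::string out;
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0)
      byte |= 0x80;
    out.push_back(static_cast<char>(byte));
  } while (v != 0);
  return out;
}

// SLEB128: same grouping, but encoding stops once the remaining value is
// only the sign extension of bit 6 of the byte just produced.
inline std::string sleb128(int64_t v) {
  std::string out;
  for (;;) {
    uint8_t byte = v & 0x7f;
    v >>= 7; // assumed to be an arithmetic shift
    bool signBit = (byte & 0x40) != 0;
    bool done = (v == 0 && !signBit) || (v == -1 && signBit);
    out.push_back(static_cast<char>(done ? byte : byte | 0x80));
    if (done)
      return out;
  }
}

// Zigzag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... so that a signed value
// of small magnitude becomes a small unsigned value.
inline uint32_t zigzag32(int32_t i) {
  return static_cast<uint32_t>(i >> 31) ^ (static_cast<uint32_t>(i) << 1);
}
```

With these definitions a one-byte SLEB128 covers -64 through 63, and zigzag followed by a one-byte ULEB128 covers the same range (zigzag32(-64) is 127), so zigzag does not widen the one-byte range.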


We can introduce a gas option --crel, so that users can specify `gcc
-Wa,--crel a.c` (-flto also gets -Wa, options).

I propose that we add another gas option --implicit-addends-for-data
(does the name look good?) to allow non-code sections to use implicit
addends to save space (https://sourceware.org/PR31567).
Using implicit addends primarily benefits debug sections such as
.debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
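
To make the per-entry saving concrete, here is a small sketch assuming the standard 64-bit ELF record layouts. Rel64 and Rela64 are local stand-ins for Elf64_Rel and Elf64_Rela (redefined so the example does not depend on <elf.h>), and bytesSaved is a hypothetical helper. With REL the addend is implicit: it is stored in the location being relocated rather than in the relocation record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local mirrors of the 64-bit ELF relocation records.
struct Rel64 {
  uint64_t r_offset;
  uint64_t r_info;
};
struct Rela64 {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend; // explicit addend, absent in REL
};

static_assert(sizeof(Rel64) == 16, "Elf64_Rel is 16 bytes");
static_assert(sizeof(Rela64) == 24, "Elf64_Rela is 24 bytes");

// Relocation-section bytes saved when n entries move from RELA to REL.
constexpr std::size_t bytesSaved(std::size_t n) {
  return n * (sizeof(Rela64) - sizeof(Rel64));
}
```

Each entry shrinks by 8 bytes, one third of a RELA record, which is why sections with many relocations benefit most.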

-Wa,--implicit-addends-for-data can be used on its own (6.4% .o
reduction in a clang -g -g0 -gpubnames build) or together with
CREL to achieve an even larger size reduction: a single byte for
most .debug_* relocations!
With CREL, concerns of debug section relocations will become a thing
of the past.
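
The encoder quoted earlier in this message can be checked with a round trip. The following is only a sketch under stated assumptions, not the reference implementation: it follows the layout from the quoted snippet (header = count * 4 + shift, flag bits 0-2, delta offset bits 3-6, continuation bit 7), and the Reloc struct and the putULEB, putSLEB, getLEB, encodeCrel and decodeCrel names are invented here. The decoder trusts its input and does no bounds checking.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx, r_type;
  int64_t r_addend;
};
inline bool operator==(const Reloc &a, const Reloc &b) {
  return a.r_offset == b.r_offset && a.r_symidx == b.r_symidx &&
         a.r_type == b.r_type && a.r_addend == b.r_addend;
}

inline void putULEB(uint64_t v, std::string &out) {
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    out.push_back(static_cast<char>(v ? byte | 0x80 : byte));
  } while (v);
}

inline void putSLEB(int64_t v, std::string &out) {
  for (;;) {
    uint8_t byte = v & 0x7f;
    v >>= 7; // assumed to be an arithmetic shift
    bool done = (v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40));
    out.push_back(static_cast<char>(done ? byte : byte | 0x80));
    if (done)
      return;
  }
}

// Reads one LEB128 value and sign-extends it when isSigned is true.
inline uint64_t getLEB(const uint8_t *&p, bool isSigned) {
  uint64_t v = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    v |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (isSigned && shift < 64 && (byte & 0x40))
    v |= ~uint64_t(0) << shift;
  return v;
}

inline std::string encodeCrel(const std::vector<Reloc> &relocs) {
  std::string out;
  uint64_t offsetMask = 8, offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  for (const Reloc &rel : relocs)
    offsetMask |= rel.r_offset;
  unsigned shift = 0; // manual countr_zero; at most 3 because bit 3 is set
  while (!((offsetMask >> shift) & 1))
    ++shift;
  putULEB(relocs.size() * 4 + shift, out);
  for (const Reloc &rel : relocs) {
    uint64_t delta = (rel.r_offset - offset) >> shift;
    uint8_t b = uint8_t((delta & 0xf) << 3 |
                        (symidx != rel.r_symidx ? 1 : 0) |
                        (type != rel.r_type ? 2 : 0) |
                        (addend != uint64_t(rel.r_addend) ? 4 : 0));
    out.push_back(char(delta < 0x10 ? b : b | 0x80));
    if (delta >= 0x10)
      putULEB(delta >> 4, out);
    if (b & 1) putSLEB(int32_t(rel.r_symidx - symidx), out);
    if (b & 2) putSLEB(int32_t(rel.r_type - type), out);
    if (b & 4) putSLEB(int64_t(uint64_t(rel.r_addend) - addend), out);
    offset = rel.r_offset;
    symidx = rel.r_symidx;
    type = rel.r_type;
    addend = uint64_t(rel.r_addend);
  }
  return out;
}

inline std::vector<Reloc> decodeCrel(const std::string &data) {
  const uint8_t *p = reinterpret_cast<const uint8_t *>(data.data());
  uint64_t hdr = getLEB(p, false);
  uint64_t count = hdr / 4;
  unsigned shift = unsigned(hdr % 4);
  Reloc cur{};
  std::vector<Reloc> relocs;
  for (uint64_t i = 0; i != count; ++i) {
    uint8_t b = *p++;
    uint64_t delta = (b >> 3) & 0xf;
    if (b & 0x80)
      delta |= getLEB(p, false) << 4;
    cur.r_offset += delta << shift;
    if (b & 1) cur.r_symidx += uint32_t(getLEB(p, true));
    if (b & 2) cur.r_type += uint32_t(getLEB(p, true));
    if (b & 4) cur.r_addend = int64_t(uint64_t(cur.r_addend) + getLEB(p, true));
    relocs.push_back(cur);
  }
  return relocs;
}
```

A relocation that shares symbol, type and addend with its predecessor and sits at a small offset delta takes a single byte in this scheme, which matches the one-byte claim for most .debug_* relocations when addends are implicit.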


Re: [RFC] add regenerate Makefile target

2024-03-28 Thread Jens Remus via Gcc

On 27.03.2024 at 19:14, Christophe Lyon wrote:

On Tue, 26 Mar 2024 at 16:42, Jens Remus  wrote:

On 15.03.2024 at 09:50, Christophe Lyon wrote:

On Thu, 14 Mar 2024 at 19:10, Simon Marchi  wrote:

On 2024-03-13 04:02, Christophe Lyon via Gdb wrote:

...

There's just the issue of files that are generated using tools that are
compiled.  When experimenting with maintainer mode the other day, I
stumbled on the opcodes/i386-gen, for instance.  I don't have a good
solution to that, except to rewrite these tools in a scripting language
like Python.


So for opcodes, it currently means rewriting such programs for i386,
aarch64, ia64 and luckily msp430/rl78/rx share the same opc2c
generator.
Not sure how to find volunteers?


Why are those generated source files checked into the repository and not
generated at build-time? Would there be a reason for s390 to do so as well
(opcodes/s390-opc.tab is generated at build-time from
opcodes/s390-opc.txt using s390-mkopc built from opcodes/s390-mkopc.c)?


I remember someone mentioned a requirement of being able to rebuild
with the sources on a read-only filesystem.
I don't know if there's a requirement that such generated files should
be part of the source tree though. Is opcodes/s390-opc.tab in builddir
or in srcdir?

I think there are other motivations but I can't remember them at the moment :-)


Thank you for the insights, Christophe! opcodes/s390-opc.tab is 
generated in builddir.


Regards,
Jens
--
Jens Remus
Linux on Z Development (D3303) and z/VSE Support
+49-7031-16-1128 Office
jre...@de.ibm.com



Re: Building Single Tree for a Specific Set of CFLAGS

2024-03-28 Thread Christophe Lyon via Gcc




On 3/27/24 20:07, Joel Sherrill wrote:



On Wed, Mar 27, 2024 at 3:53 AM Christophe Lyon via Gcc wrote:


Hi!

On 3/26/24 22:52, Joel Sherrill via Gcc wrote:
 > Hi
 >
 > For RTEMS, we normally build a multilib'ed gcc+newlib, but I have a case
 > where the CPU model is something not covered by our multilibs and not one
 > we are keen to add. I've looked around but not found anything that makes me
 > feel confident.
 >
 > What's the magic for building a gcc+newlib with a single set of libraries
 > that are built for a specific CPU CFLAGS?
 >
 > I am trying --disable-multilib on the gcc configure and adding
 > CFLAGS_FOR_TARGET to make.
 >
 > Advice appreciated.
 >

I would configure GCC with --disable-multilib --with-cpu=XXX
--with-mode=XXX --with-float=XXX [maybe --with-fpu=XXX]
This way GCC defaults to what you want.


Thanks. Is there any documentation or even a good example? I found
--with-mode=[arm|thumb] but am having trouble mapping the others back
to GCC options.


I don't know of any good doc/example.
I look in gcc/config.gcc to check what is supported.



I have this for CFLAGS_FOR_TARGET

"-mcpu=cortex-m7 -mthumb -mlittle-endian -mfloat-abi=hard 
-mfpu=fpv5-sp-d16 -march=armv7e-m+fpv5"


I think that means...

--with-mode=thumb     for -mthumb
--with-cpu=cortex-m7  for -mcpu=cortex-m7
--with-float=hard     for -mfloat-abi=hard

That leaves a few options I don't know how to map.


You can see that for arm:
supported_defaults="arch cpu float tune fpu abi mode tls"
so there's a --with-XXX for any of the above, meaning that there's no 
--with-endian (default endianness on arm is derived from the target 
triplet eg. armeb-* vs arm-*)


Also note that config.gcc checks that you don't provide both
--with-cpu and --with-arch
or --with-cpu and --with-tune

HTH,

Christophe


--joel


Thanks,

Christophe


 > Thanks.
 >
 > --joel



Re: CREL relocation format for ELF

2024-03-28 Thread Jan Beulich via Gcc
On 28.03.2024 08:43, Fangrui Song wrote:
> On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song  wrote:
>>
>> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song  wrote:
>>>
>>> The relocation formats REL and RELA for ELF are inefficient. In a
>>> release build of Clang for x86-64, .rela.* sections consume a
>>> significant portion (approximately 20.9%) of the file size.
>>>
>>> I propose RELLEB, a new format offering significant file size
>>> reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
>>>
>>> Your thoughts on RELLEB are welcome!
>>>
>>> Detailed analysis:
>>> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
>>> generic ABI (ELF specification):
>>> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
>>> binutils feature request: 
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=31475
>>> LLVM: 
>>> https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
>>>
>>> Implementation primarily involves binutils changes. Any volunteers?
>>> For GCC, a driver option like -mrelleb in my Clang prototype would be
>>> needed. The option instructs the assembler to use RELLEB.
>>
>> The format was tentatively named RELLEB. As I refine the original pure
>> LEB-based format, “RELLEB” might not be the most fitting name.
>>
>> I have switched to SHT_CREL/DT_CREL/.crel and updated
>> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
>> and
>> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>>
>> The new format is simpler and better than RELLEB even in the absence
>> of the shifted offset technique.
>>
>> Dynamic relocations using CREL are even smaller than Android's packed
>> relocations.
>>
>> // encodeULEB128(uint64_t, raw_ostream &os);
>> // encodeSLEB128(int64_t, raw_ostream &os);
>>
>> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
>> uint32_t symidx = 0, type = 0;
>> for (const Reloc &rel : relocs)
>>   offsetMask |= rel.r_offset;
>> int shift = std::countr_zero(offsetMask);
>> encodeULEB128(relocs.size() * 4 + shift, os);
>> for (const Reloc &rel : relocs) {
>>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>>   offset = rel.r_offset;
>>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>>               (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
>>   if (deltaOffset < 0x10) {
>>     os << char(b);
>>   } else {
>>     os << char(b | 0x80);
>>     encodeULEB128(deltaOffset >> 4, os);
>>   }
>>   if (b & 1) {
>>     encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
>>     symidx = rel.r_symidx;
>>   }
>>   if (b & 2) {
>>     encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
>>     type = rel.r_type;
>>   }
>>   if (b & 4) {
>>     encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
>>     addend = rel.r_addend;
>>   }
>> }
>>
>> ---
>>
>> While alternatives like PrefixVarInt (or a suffix-based variant) might
>> excel when encoding larger integers, LEB128 offers advantages when
>> most integers fit within one or two bytes, as it avoids the need for
>> shift operations in the common one-byte representation.
>>
>> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
>> SLEB128-encoded type/addend to use ULEB128 instead, the generate code
>> is inferior to or on par with SLEB128 for one-byte encodings.
> 
> 
> We can introduce a gas option --crel, then users can specify `gcc
> -Wa,--crel a.c` (-flto also gets -Wa, options).
> 
> I propose that we add another gas option --implicit-addends-for-data
> (does the name look good?) to allow non-code sections to use implicit
> addends to save space
> (https://sourceware.org/PR31567).
> Using implicit addends primarily benefits debug sections such as
> .debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
> data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
> 
> -Wa,--implicit-addends-for-data can be used on its own (6.4% .o
> reduction in a clang -g -g0 -gpubnames build)

And this option will then switch from RELA to REL relocation sections,
effectively in violation of most ABIs I'm aware of?

Furthermore, why just data? x86 at least could benefit almost as much
for code. Hence maybe better --implicit-addends=data, with an
option for architectures to also permit --implicit-addends=text.

Jan

>   or together with
> CREL to achieve more incredible size reduction, one single byte for
> most .debug_* relocations!
> With CREL, concerns of debug section relocations will become a thing
> of the past.



A problem about g++ 4.8.5

2024-03-28 Thread shaoben zhu via Gcc
I compile my program with g++ 4.8.5. I find that when my program exits, it
first destroys the static member variables of class A, and then
destroys a global object of class A. This causes an error in my program.
Could you tell me how I can avoid this problem? Should I upgrade the
compiler version or modify my code?

My code looks like this:
class A {
public:
    static int var;
    ~A();   // A's destructor depends on var
};

int A::var;
A   obj;

var is destroyed before obj.


Re: A problem about g++ 4.8.5

2024-03-28 Thread Jonathan Wakely via Gcc
Hello,
This mailing list is for discussing the development of GCC itself.
Please use the gcc-help mailing list for help questions. Please send
any replies to that list instead of this one, thanks.

On Thu, 28 Mar 2024 at 09:35, shaoben zhu via Gcc  wrote:
>
> I compile my program using g++ 4.8.5, I find that when my program exits, it

That version is ancient and has not been supported by the GCC project
for many years. You might be able to get support for it from a vendor
who provided you with that version.

> first deconstructs the static member variables of class A, and then
> deconstructs a global object of class A. This caused an error in my program.
> Could you tell me how can I avoid this problem?Upgrade compiler
> version?Modify my code?
>
> my code like this:
> class A{
> static int var;
> ~A();   //A  Destructor depended var
> };
>
> int A::var;
> A   obj;
>
> var deconstructs before obj

That's not what I see using g++ 4.8.5 on CentOS 6. With the following code:

struct M { ~M() { __builtin_puts("~M"); } };
struct A {
    static M var;
    ~A() { __builtin_puts("~A"); }
};

M A::var;
A   obj;

int main()
{
}

I get this output:

~A
~M

N.B. I had to change your int to a class type, because int has no
destructor anyway, so doesn't get destroyed. So your example does not
demonstrate the problem you're describing.

If the definitions of A::var and obj are in the same source file then
their destructors should run in the reverse order of their definitions.
If they are in separate source files, then their initialization order
is unspecified, and so the destruction order is also unspecified. If
you define them in the same file, it should work correctly.


Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Alan Modra via Gcc
On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song  wrote:
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!

Does anyone really care about relocatable object file size?  If they
do, wouldn't they be better off using a compressed file system?

-- 
Alan Modra
Australia Development Lab, IBM


Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Fangrui Song via Gcc
On Thu, Mar 28, 2024 at 6:04 AM Alan Modra  wrote:
>
> On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> > On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song  wrote:
> > > I propose RELLEB, a new format offering significant file size
> > > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> > >
> > > Your thoughts on RELLEB are welcome!
>
> Does anyone really care about relocatable object file size?  If they
> do, wouldn't they be better off using a compressed file system?

Yes, many people care about relocatable file sizes.

* Relocation sizes affect DWARF evolution and we were/are using an
imperfect metric due to overly bloated REL/RELA. .debug_str_offsets
does not get much traction in GCC, probably partly because it needs
relocations. DWARF v5 introduced changes to keep relocations small.
Many are good on their own, but we need to be cautious of relocation
concerns causing us to pick the wrong trade-off in the future.
* On many Linux targets, Clang emits .llvm_addrsig by default to allow
ld.lld --icf=safe. .llvm_addrsig stores symbol indexes in ULEB128
instead of using relocations to prevent a significant size increase.
* Static relocations make .a files larger.
* Some users care about the build artifact size due to limited disk space.
  + I believe part of the reason -ffunction-sections -fdata-sections
do not get more adoption is the relocatable file size concern.
  + I prefer to place build directories in Linux tmpfs. 12G vs 10G in
memory matters to me :)
  + Larger .o files mean more I/O. This may be more significant
when the storage is remote.


gcc-11-20240328 is now available

2024-03-28 Thread GCC Administrator via Gcc
Snapshot gcc-11-20240328 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20240328/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-11 revision c7e5c857e67694b5826b6e9c0a067491166ba7a0

You'll find:

 gcc-11-20240328.tar.xz   Complete GCC

  SHA256=297c90ed5547aab0992147138433ed450afc703e958379b499db1bf2d68bf41c
  SHA1=c6f0b24b5ce0a91614fda653cd057eecaae20194

Diffs from 11-20240321 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.