Re: Question about merging if-else blocks

2023-10-04 Thread Richard Biener via Gcc
On Sun, Oct 1, 2023 at 6:13 AM Hanke Zhang  wrote:
>
> Richard Biener wrote on Wed, Sep 27, 2023 at 15:30:
> >
> > On Wed, Sep 27, 2023 at 7:21 AM Hanke Zhang via Gcc  wrote:
> > >
> > > Thanks! I understand what you mean, then can I think that if the
> > > function here is not an external function, but a function visible to
> > > the compiler and the function doesn't modify `a`, then these two
> > > blocks can be merged?
> >
> > Yes.  The key transform you'd see before any of the merging is
> > CSE of the loads from 'a', then the rest is equivalent to the local
> > variable case.
> >
> > Richard.
>
> Hi, Richard
>
> I'm still a little confused about this.
>
> I want to change the default behavior of gcc. We know that printf
> won't change the value of 'a'. I'd like to let the compiler to get
> this information as well. How can I do that? Or which pass should I
> focus on?

GCC has a builtin for 'printf' so it can handle those specially in
builtins.cc:builtin_fnspec (see attr-fnspec.h for how the magic
strings are encoded).  If that's taught that printf cannot modify
global memory it should already work, note though ...

> By disassembling the exe file generated by icc, I found that icc will
> merge these two blocks with the example code below. So I think there
> maybe some ways to make it.

... glibc, for example, allows user-provided printf format callbacks, so
printf might call back into the current TU and modify globals in such a
callback.  That's a GNU extension to printf that ICC likely doesn't
support
(https://www.gnu.org/software/libc/manual/html_node/Customizing-Printf.html),
so our not doing this currently is a matter of correctness.

I'm not sure if this extension is much used or if it is maybe deprecated.

With LTO it _might_ be possible to check whether any of the functions
dealing with this (register_printf_function) is used.  Without LTO for
symbols with hidden visibility that do not escape the TU this analysis
can be done TU-local.

Richard.

> Thanks.
> Hanke Zhang.
>
> >
> > > Marc Glisse wrote on Wed, Sep 27, 2023 at 12:51:
> > > >
> > > > On Wed, 27 Sep 2023, Hanke Zhang via Gcc wrote:
> > > >
> > > > > Hi, I have recently been working on merging if-else statement blocks,
> > > > > and I found a rather bizarre phenomenon that I would like to ask
> > > > > about.
> > > > > A rough explanation is that for two consecutive if-else blocks, if
> > > > > their if statements are exactly the same, they should be merged, like
> > > > > the following program:
> > > > >
> > > > > int a = atoi(argv[1]);
> > > > > if (a) {
> > > > >  printf("if 1");
> > > > > } else {
> > > > >  printf("else 1");
> > > > > }
> > > > > if (a) {
> > > > >  printf("if 2");
> > > > > } else {
> > > > >  printf("else 2");
> > > > > }
> > > > >
> > > > > After using the -O3 -flto optimization option, it can be optimized as 
> > > > > follows:
> > > > >
> > > > > int a = atoi(argv[1]);
> > > > > if (a) {
> > > > >  printf("if 1");
> > > > >  printf("if 2");
> > > > > } else {
> > > > >  printf("else 1");
> > > > >  printf("else 2");
> > > > > }
> > > > >
> > > > > But `a` here is a local variable. If I declare a as a global variable,
> > > > > it cannot be optimized as above. I would like to ask why this is? And
> > > > > is there any solution?
> > > >
> > > > If 'a' is a global variable, how do you know 'printf' doesn't modify its
> > > > value? (you could know it for printf, but it really depends on the
> > > > function that is called)
> > > >
> > > > --
> > > > Marc Glisse
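
To make Marc's point concrete, here is a minimal sketch in which the callee really does modify the global between the two tests, so merging the two `if` blocks would change behaviour. The names `log_and_clear`, `run_twice`, `trace` and `demo_not_mergeable` are made up for illustration; `log_and_clear` only stands in for an opaque call such as printf and is not something from the thread.

```c
#include <assert.h>
#include <stdio.h>

int a;           /* global: any opaque callee may modify it */
int trace[2];    /* records which branch each of the two tests took */

/* Hypothetical callee standing in for an opaque call such as printf. */
static void log_and_clear(const char *msg)
{
    puts(msg);
    a = 0;       /* side effect on the global */
}

/* Two consecutive tests of the same global. */
void run_twice(int initial)
{
    a = initial;

    if (a) { trace[0] = 1; log_and_clear("if 1"); }
    else   { trace[0] = 0; puts("else 1"); }

    /* 'a' has to be reloaded here because the call may have changed it. */
    if (a) { trace[1] = 1; log_and_clear("if 2"); }
    else   { trace[1] = 0; puts("else 2"); }
}

/* With a nonzero start value the first test takes the 'if' branch and
   the second takes the 'else' branch, which a merged form cannot do. */
void demo_not_mergeable(void)
{
    run_twice(1);
    assert(trace[0] == 1 && trace[1] == 0);
}
```

If `a` were a local whose address never escapes, no callee could change it, which is why the local-variable case can be merged.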


Re: contrib/reghunt documentation

2023-10-04 Thread Richard Biener via Gcc
On Sun, Oct 1, 2023 at 9:20 PM Thomas Koenig via Gcc  wrote:
>
> Hi,
>
> is there some sort of concise explanation of how to use the
> scripts in contrib/reghunt?  There is no real documentation
> for what is in the directory, specifically not how to invoke
> them, and which directory to invoke them from. I have also
> not been able to do run the examples from contrib/reghunt/examples,
> let alone my own regression search.

Since this all predates the move to git, I wouldn't suggest using these scripts.
Maybe we should simply remove all of contrib/reghunt?  CCing author.

Richard.


Re: Question about function splitting

2023-10-04 Thread Richard Biener via Gcc
On Mon, Oct 2, 2023 at 7:15 PM Hanke Zhang via Gcc  wrote:
>
> Martin Jambor wrote on Tue, Oct 3, 2023 at 00:34:
> >
> > Hello,
> >
> > On Mon, Oct 02 2023, Hanke Zhang via Gcc wrote:
> > > Hi, I have some questions about the strategy and behavior of function
> > > splitting in gcc, like the following code:
> > >
> > > int glob;
> > > void f() {
> > >   if (glob) {
> > > >     printf("short path\n");
> > > >     return;
> > >   }
> > >   // do lots of expensive things
> > >   // ...
> > > }
> > >
> > > I hope it can be broken down like below, so that the whole function
> > > can perhaps be inlined, which is more efficient.
> > >
> > > int glob;
> > > void f() {
> > >   if (glob) {
> > > >     printf("short path\n");
> > > >     return;
> > >   }
> > >   f_part();
> > > }
> > >
> > > void f_part() {
> > >   // do lots of expensive things
> > >   // ...
> > > }
> > >
> > >
> > > But on the contrary, gcc splits it like these, which not only does not
> > > bring any benefits, but may increase the time consumption, because the
> > > function call itself is a more resource-intensive thing.
> > >
> > > int glob;
> > > void f() {
> > >   if (glob) {
> > > >     f_part();
> > > >     return;
> > >   }
> > >   // do lots of expensive things
> > >   // ...
> > > }
> > >
> > > void f_part() {
> > >   printf("short path\n"); // just do this
> > > }
> > >
> > > Are there any options I can offer to gcc to change this behavior? Or
> > > do I need to make some changes in ipa-split.cc?
> >
> > I'd suggest you file a bug to Bugzilla with a specific example that is
> > mis-handled, then we can have a look and discuss what and why happens
> > and what can be done about it.
> >
> > Thanks,
> >
> > Martin
>
> Hi, thanks for your reply.
>
> I'm trying to create an account right now. And I put a copy of the
> example code here in case someone is interested.
>
> And I'm using gcc 12.3.0. When you compile the code below via 'gcc
> test.c -O3 -flto -fdump-tree-fnsplit', you will find a phenomenon that
> is consistent with what I described above in the gimple which is
> dumped from fnsplit.

I think fnsplit currently splits out _cold_ code, I suppose !opstatus
is predicted to be false most of the time.

It looks like your intent is to inline this very early check as

  if (!opstatus) { test_split_write_1 (..); } else { test_split_write_2 (..); }

to possibly elide that test?  I would guess that IPA-CP is supposed to
do this but eventually refuses to create a clone for this case since
it would be large.

Unfortunately function splitting doesn't run during IPA transforms,
but maybe IPA-CP can be taught to avoid the expensive clone by
performing what IPA split does when a check in the entry block that
splits control flow can be optimized?

Richard.

> #include <stdio.h>
> #include <stdlib.h>
>
> int opstatus;
> unsigned char *objcode = 0;
> unsigned long position = 0;
> char *globalfile;
>
> int test_split_write(char *file) {
>   FILE *fhd;
>
>   if (!opstatus) {
>     // short path here
>     printf("Object code generation not active! Forgot to call "
>            "quantum_objcode_start?\n");
>     return 1;
>   }
>
>   if (!file)
>     file = globalfile;
>
>   fhd = fopen(file, "w");
>
>   if (fhd == 0)
>     return -1;
>
>   fwrite(objcode, position, 1, fhd);
>
>   fclose(fhd);
>
>   int *arr = malloc(1000);
>   for (int i = 0; i < 1000; i++) {
>     arr[i] = rand();
>   }
>
>   return 0;
> }
>
> // to avoid `test_split_write` inlining into main
> void __attribute__((noinline)) call() { test_split_write("./txt"); }
>
> int main() {
>   opstatus = rand();
>   objcode = malloc(100);
>   position = 0;
>   call();
>   return 0;
> }
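
For reference, the split Hanke is asking for can be written by hand. The following is only a sketch of that source-level shape, assuming GCC or Clang for the `noinline` attribute; it is not what fnsplit produces, and `heavy_calls` and `demo_split` are made-up names added so the behaviour can be checked.

```c
#include <assert.h>
#include <stdio.h>

int glob;
int heavy_calls;   /* counts how often the expensive part runs */

/* The expensive tail, kept out of line (GCC/Clang attribute). */
__attribute__((noinline)) static void f_part(void)
{
    heavy_calls++;
    /* ... lots of expensive things ... */
}

/* Small wrapper: cheap test plus short path, a candidate for inlining. */
static inline void f(void)
{
    if (glob) {
        puts("short path");
        return;
    }
    f_part();
}

/* Exercises both paths; expects heavy_calls to start at 0. */
void demo_split(void)
{
    glob = 1;
    f();                        /* short path only */
    assert(heavy_calls == 0);

    glob = 0;
    f();                        /* falls through to the heavy part */
    assert(heavy_calls == 1);
}
```

Whether the wrapper actually gets inlined into its callers still depends on the inliner's heuristics and the options used.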


Re: Function return value can't be infered when it's not inlined

2023-10-04 Thread Richard Biener via Gcc
On Tue, Oct 3, 2023 at 6:30 PM Hanke Zhang via Gcc  wrote:
>
> Hi, I'm a little confused about the behavior of gcc when the function
> is not inlined.
>
> Here is an example code:
>
> int __attribute__((noinline)) foo() {
> return 1;
> }
>
> int main() {
> if (foo()) {
> printf("foo() returned 1\n");
> } else {
> printf("foo() returned 0\n");
> }
> return 0;
> }
>
> After compiling this via `-O3 -flto`, the else block isn't been
> optimized and still exists.
>
> Even it's so obvious that the function will return '1', can't the
> compiler see that? Does gcc only get this information by inlining the
> function? Or is that what the gcc does?
>
> If so, how to make a change to let gcc get this information then?

I think IPA-CP would be doing this but the issue is that historically
'noinline' also disabled other IPA transforms and we've kept that
for backward compatibility even when we introduced the separate 'noipa'
attribute.

Richard.

>
> Thanks
> Hanke Zhang


Re: Function return value can't be infered when it's not inlined

2023-10-04 Thread Richard Biener via Gcc
On Wed, Oct 4, 2023 at 10:37 AM Richard Biener
 wrote:
>
> On Tue, Oct 3, 2023 at 6:30 PM Hanke Zhang via Gcc  wrote:
> >
> > Hi, I'm a little confused about the behavior of gcc when the function
> > is not inlined.
> >
> > Here is an example code:
> >
> > int __attribute__((noinline)) foo() {
> > return 1;
> > }
> >
> > int main() {
> > if (foo()) {
> > printf("foo() returned 1\n");
> > } else {
> > printf("foo() returned 0\n");
> > }
> > return 0;
> > }
> >
> > After compiling this via `-O3 -flto`, the else block isn't been
> > optimized and still exists.
> >
> > Even it's so obvious that the function will return '1', can't the
> > compiler see that? Does gcc only get this information by inlining the
> > function? Or is that what the gcc does?
> >
> > If so, how to make a change to let gcc get this information then?
>
> I think IPA-CP would be doing this but the issue is that historically
> 'noinline' also disabled other IPA transforms and we've kept that
> for backward compatibility even when we introduced the separate 'noipa'
> attribute.

Oh, and I forgot that IIRC neither IPA CP nor IPA SRA handle return
values in a way that exposes this to optimization passes (there's no
way to encode this in fnspec; we'd need some return value value-range,
record that, and make VRP/ranger query it on calls).

Richard.

> Richard.
>
> >
> > Thanks
> > Hanke Zhang


Re: contrib/reghunt documentation

2023-10-04 Thread Jonathan Wakely via Gcc
On Wed, 4 Oct 2023 at 09:04, Richard Biener via Gcc  wrote:
>
> On Sun, Oct 1, 2023 at 9:20 PM Thomas Koenig via Gcc  wrote:
> >
> > Hi,
> >
> > is there some sort of concise explanation of how to use the
> > scripts in contrib/reghunt?  There is no real documentation
> > for what is in the directory, specifically not how to invoke
> > them, and which directory to invoke them from. I have also
> > not been able to do run the examples from contrib/reghunt/examples,
> > let alone my own regression search.
>
> since this all predates the git move I wouldn't suggest to use these scripts.
> Maybe we should simply remove all of contrib/reghunt?  CCing author.

I'm sure it could be adapted to use git bisect, but just using git
bisect directly seems much simpler. You usually need fewer than 10
lines of shell script to write a script for use with git bisect run.


Re: Question about merging if-else blocks

2023-10-04 Thread Florian Weimer via Gcc
* Richard Biener:

>> By disassembling the exe file generated by icc, I found that icc will
>> merge these two blocks with the example code below. So I think there
>> maybe some ways to make it.
>
> ... glibc for example allows user-provided printf format callbacks so
> printf might call back into the current TU and modify globals in such
> callback.  That's a GNU extension to printf that ICC likely doesn't
> support 
> (https://www.gnu.org/software/libc/manual/html_node/Customizing-Printf.html),
> so that we're currently not doing this is for correctness.
>
> I'm not sure if this extension is much used or if it is maybe
> deprecated.

There's also fopencookie, which is more widely available.  The GNU C
library supports assignment to stdout, so an fopencookie stream could be
the target of printf, also triggering callbacks.

But I'm not sure if callbacks updating global variables should prevent
GCC from treating printf et al. as leaf functions.

Thanks,
Florian



Re: Function return value can't be infered when it's not inlined

2023-10-04 Thread Hanke Zhang via Gcc
Richard Biener wrote on Wed, Oct 4, 2023 at 16:43:
>
> On Wed, Oct 4, 2023 at 10:37 AM Richard Biener
>  wrote:
> >
> > On Tue, Oct 3, 2023 at 6:30 PM Hanke Zhang via Gcc  wrote:
> > >
> > > Hi, I'm a little confused about the behavior of gcc when the function
> > > is not inlined.
> > >
> > > Here is an example code:
> > >
> > > int __attribute__((noinline)) foo() {
> > > return 1;
> > > }
> > >
> > > int main() {
> > > if (foo()) {
> > > printf("foo() returned 1\n");
> > > } else {
> > > printf("foo() returned 0\n");
> > > }
> > > return 0;
> > > }
> > >
> > > After compiling this via `-O3 -flto`, the else block isn't been
> > > optimized and still exists.
> > >
> > > Even it's so obvious that the function will return '1', can't the
> > > compiler see that? Does gcc only get this information by inlining the
> > > function? Or is that what the gcc does?
> > >
> > > If so, how to make a change to let gcc get this information then?
> >
> > I think IPA-CP would be doing this but the issue is that historically
> > 'noinline' also disabled other IPA transforms and we've kept that
> > for backward compatibility even when we introduced the separate 'noipa'
> > attribute.

Thanks. The initial example is a function that uses va_args as
parameters. It cannot be inlined because of va_args, and then its
return value cannot be predicted like above. For example, the
following function:

int foo (int num, ...) {
    va_list args;
    va_start(args, num);
    int a1 = va_arg(args, int);
    int a2 = va_arg(args, int);
    printf("a1 = %d, a2 = %d\n", a1, a2);
    va_end(args);
    return 1;
}

int main() {
    if (foo(2, rand(), rand())) {
        printf("foo() returned 1\n");
    } else {
        printf("foo() returned 0\n");
    }
    return 0;
}

Wouldn't such a function be optimized like 'noinline'?

>
> Oh, and I forgot that IIRC neither IPA CP nor IPA SRA handle return
> functions in a way exposing this to optimization passes (there's no
> way to encode this in fnspec, we'd need some return value value-range
> and record that and make VRP/ranger query it on calls).
>
> Richard.
>

Thanks. So, does that mean I have to let VRP/ranger be able to query
the return value so that the else block can be optimized out?

> > Richard.
> >
> > >
> > > Thanks
> > > Hanke Zhang


Re: Question about function splitting

2023-10-04 Thread Hanke Zhang via Gcc
But when I change the code 'opstatus = rand()' to
'opstatus = rand() % 2', the probability of opstatus being 0 should be
50%, but the result remains the same, i.e. it is still split at that point.

The details can be found in Bugzilla:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111672

Richard Biener wrote on Wed, Oct 4, 2023 at 16:20:
>
> On Mon, Oct 2, 2023 at 7:15 PM Hanke Zhang via Gcc  wrote:
> >
> > Martin Jambor wrote on Tue, Oct 3, 2023 at 00:34:
> > >
> > > Hello,
> > >
> > > On Mon, Oct 02 2023, Hanke Zhang via Gcc wrote:
> > > > Hi, I have some questions about the strategy and behavior of function
> > > > splitting in gcc, like the following code:
> > > >
> > > > int glob;
> > > > void f() {
> > > >   if (glob) {
> > > > >     printf("short path\n");
> > > > >     return;
> > > >   }
> > > >   // do lots of expensive things
> > > >   // ...
> > > > }
> > > >
> > > > I hope it can be broken down like below, so that the whole function
> > > > can perhaps be inlined, which is more efficient.
> > > >
> > > > int glob;
> > > > void f() {
> > > >   if (glob) {
> > > > >     printf("short path\n");
> > > > >     return;
> > > >   }
> > > >   f_part();
> > > > }
> > > >
> > > > void f_part() {
> > > >   // do lots of expensive things
> > > >   // ...
> > > > }
> > > >
> > > >
> > > > But on the contrary, gcc splits it like these, which not only does not
> > > > bring any benefits, but may increase the time consumption, because the
> > > > function call itself is a more resource-intensive thing.
> > > >
> > > > int glob;
> > > > void f() {
> > > >   if (glob) {
> > > > >     f_part();
> > > > >     return;
> > > >   }
> > > >   // do lots of expensive things
> > > >   // ...
> > > > }
> > > >
> > > > void f_part() {
> > > >   printf("short path\n"); // just do this
> > > > }
> > > >
> > > > Are there any options I can offer to gcc to change this behavior? Or
> > > > do I need to make some changes in ipa-split.cc?
> > >
> > > I'd suggest you file a bug to Bugzilla with a specific example that is
> > > mis-handled, then we can have a look and discuss what and why happens
> > > and what can be done about it.
> > >
> > > Thanks,
> > >
> > > Martin
> >
> > Hi, thanks for your reply.
> >
> > I'm trying to create an account right now. And I put a copy of the
> > example code here in case someone is interested.
> >
> > And I'm using gcc 12.3.0. When you compile the code below via 'gcc
> > test.c -O3 -flto -fdump-tree-fnsplit', you will find a phenomenon that
> > is consistent with what I described above in the gimple which is
> > dumped from fnsplit.
>
> I think fnsplit currently splits out _cold_ code, I suppose !opstatus
> is predicted to be false most of the time.
>
> It looks like your intent is to inline this very early check as
>
>   if (!opstatus) { test_split_write_1 (..); } else { test_split_write_2 (..); }
>
> to possibly elide that test?  I would guess that IPA-CP is supposed to
> do this but eventually refuses to create a clone for this case since
> it would be large.
>
> Unfortunately function splitting doesn't run during IPA transforms,
> but maybe IPA-CP can be teached how to avoid the expensive clone
> by performing what IPA split does in the case a check in the entry
> block which splits control flow can be optimized?
>
> Richard.
>
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > int opstatus;
> > unsigned char *objcode = 0;
> > unsigned long position = 0;
> > char *globalfile;
> >
> > int test_split_write(char *file) {
> >   FILE *fhd;
> >
> >   if (!opstatus) {
> >     // short path here
> >     printf("Object code generation not active! Forgot to call "
> >            "quantum_objcode_start?\n");
> >     return 1;
> >   }
> >
> >   if (!file)
> >     file = globalfile;
> >
> >   fhd = fopen(file, "w");
> >
> >   if (fhd == 0)
> >     return -1;
> >
> >   fwrite(objcode, position, 1, fhd);
> >
> >   fclose(fhd);
> >
> >   int *arr = malloc(1000);
> >   for (int i = 0; i < 1000; i++) {
> >     arr[i] = rand();
> >   }
> >
> >   return 0;
> > }
> >
> > // to avoid `test_split_write` inlining into main
> > void __attribute__((noinline)) call() { test_split_write("./txt"); }
> >
> > int main() {
> >   opstatus = rand();
> >   objcode = malloc(100);
> >   position = 0;
> >   call();
> >   return 0;
> > }


Re: Function return value can't be infered when it's not inlined

2023-10-04 Thread Richard Biener via Gcc



> On 04.10.2023 at 16:16, Hanke Zhang wrote:
> 
> Richard Biener wrote on Wed, Oct 4, 2023 at 16:43:
>> 
>>> On Wed, Oct 4, 2023 at 10:37 AM Richard Biener
>>>  wrote:
>>> 
>>> On Tue, Oct 3, 2023 at 6:30 PM Hanke Zhang via Gcc  wrote:
 
>>>> Hi, I'm a little confused about the behavior of gcc when the function
>>>> is not inlined.
>>>>
>>>> Here is an example code:
>>>>
>>>> int __attribute__((noinline)) foo() {
>>>>     return 1;
>>>> }
>>>>
>>>> int main() {
>>>>     if (foo()) {
>>>>         printf("foo() returned 1\n");
>>>>     } else {
>>>>         printf("foo() returned 0\n");
>>>>     }
>>>>     return 0;
>>>> }
>>>>
>>>> After compiling this via `-O3 -flto`, the else block isn't been
>>>> optimized and still exists.
>>>>
>>>> Even it's so obvious that the function will return '1', can't the
>>>> compiler see that? Does gcc only get this information by inlining the
>>>> function? Or is that what the gcc does?
>>>>
>>>> If so, how to make a change to let gcc get this information then?
>>> 
>>> I think IPA-CP would be doing this but the issue is that historically
>>> 'noinline' also disabled other IPA transforms and we've kept that
>>> for backward compatibility even when we introduced the separate 'noipa'
>>> attribute.
> 
> Thanks. The initial example is a function that uses va_args as
> parameters. It cannot be inlined because of va_args, and then its
> return value cannot be predicted like above. For example, the
> following function:
> 
> int foo (int num, ...) {
>va_list args;
>va_start(args, num);
>int a1 = va_arg(args, int);
>int a2 = va_arg(args, int);
>printf("a1 = %d, a2 = %d\n", a1, a2);
>va_end(args);
>return 1;
> }
> 
> int main() {
>if (foo(2, rand(), rand())) {
>printf("foo() returned 1\n");
>} else {
>printf("foo() returned 0\n");
>}
>return 0;
> }
> 
> Wouldn't such a function be optimized like 'noinline'?
> 
>> 
>> Oh, and I forgot that IIRC neither IPA CP nor IPA SRA handle return
>> functions in a way exposing this to optimization passes (there's no
>> way to encode this in fnspec, we'd need some return value value-range
>> and record that and make VRP/ranger query it on calls).
>> 
>> Richard.
>> 
> 
> Thanks. So, does that mean I have to let VRP/ranger be able to query
> the return value so that the else block can be optimized out?

Yes (and compute a return value range).

Richard 

>>> Richard.
>>> 
 
>>>> Thanks
>>>> Hanke Zhang


-Wint-conversion as errors seems doable for GCC 14

2023-10-04 Thread Florian Weimer via Gcc
I completed a Fedora rawhide rebuild with an instrumented GCC (~14,500
packages).  156 packages failed to build with a logged -Wint-conversion
error.  This number is much lower than what I expected, and I think we
should include -Wint-conversion in the GCC 14 changes.

My instrumentation isn't very good and has false positives due to
cascading errors, such as this example from Emacs:

conftest.c: In function 'main':
conftest.c:246:21: error: passing argument 1 of 'pthread_setname_np' makes integer from pointer without a cast
  246 | pthread_setname_np ("a");
      |                     ^~~
      |                     |
      |                     char *
In file included from conftest.c:242:
/usr/include/pthread.h:463:42: note: expected 'pthread_t' {aka 'long unsigned int'} but argument is of type 'char *'
  463 | extern int pthread_setname_np (pthread_t __target_thread, const char *__name)
      |                                        ~~^~~
conftest.c:246:1: error: too few arguments to function 'pthread_setname_np'
  246 | pthread_setname_np ("a");
      | ^~
/usr/include/pthread.h:463:12: note: declared here
  463 | extern int pthread_setname_np (pthread_t __target_thread, const char *__name)
      |            ^~

The error at column 21 is logged, even though it is harmless because the
compilation can never succeed because of the wrong argument count.

That's why I think the number of failing builds (156) is really quite low,
and we should be able to manage on the Fedora side.  I'll eventually do
another rebuild with better instrumentation (one that also handles
waivers for implicitly declared functions; currently we get errors from
those because their type is still treated as int internally; maybe I
should switch to error_mark_node for that).

Thanks,
Florian
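
For readers who have not run into it, this is the kind of code -Wint-conversion is about, together with the explicit form that stays valid once the diagnostic becomes an error. It is only an illustrative sketch: `take_id`, `pass_pointer_value` and `demo_int_conversion` are made-up names and are not taken from any of the failing packages.

```c
#include <assert.h>
#include <stdint.h>

/* Takes an integer, much like the first parameter of pthread_setname_np. */
static uintptr_t take_id(uintptr_t id)
{
    return id;
}

uintptr_t pass_pointer_value(const char *p)
{
    /* take_id(p);   <- implicit pointer-to-integer conversion; this is
       what -Wint-conversion diagnoses. */
    return take_id((uintptr_t)p);   /* explicit cast states the intent */
}

void demo_int_conversion(void)
{
    const char *s = "a";
    assert(pass_pointer_value(s) == (uintptr_t)s);
}
```

In the Emacs configure check quoted above the real fix is different, of course: the call simply has the wrong argument list for that platform's pthread_setname_np.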



Scaling -fmacro-prefix-map= to thousands entries

2023-10-04 Thread Sergei Trofimovich via Gcc
Hi gcc developers!

TL;DR:

I would like to implement a scalable way to pass `-fmacro-prefix-map=`
for `NixOS` distribution to avoid leaking build-time paths generated by
`__FILE__` macros used by various libraries.

I need some guidance on which path would be acceptable to `gcc`
upstream.

I have a few possible solutions and wonder what I should try to upstream
to GCC. The options I see:

1. Hardcode NixOS-specific way to mangle paths.

   Pros: simplest to implement, can be easily configured away if needed
   Cons: inflexible, `clang` might or might not accept the same hack

2. Extend `-fmacro-prefix-map=` (or add a new `-fmacro-prefix-map-file=`)
   to allow passing a file

   Pros: still not too hard to implement, generic enough to be used in
 other contexts.
   Cons: Will require client to construct the map file.

3. Have more flexible `-fmacro-prefix-map-regex=` option that allows
   patterns. Something like:

     -fmacro-prefix-map-regex=/nix/store/[a-z0-9]{32}-=/nix/store/-

  Pros: at least for NixOS one option will be enough to cover all
packages as they all share above template.
  Cons: pulls in some form of regex with its can of worms, including escape
delimiters, might not be flexible enough for other use cases.

4. Something else?

Which one(s) should I take to implement?
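
To make the intended rewrite concrete, the pattern from option 3 can be sketched as a plain C string function. This is only an illustration of the mapping, not the attached patch and not GCC code: `mangle_store_path`, `is_hash_char`, `STORE_PREFIX`, `HASH_LEN` and `demo_mangle` are names made up here, and the hash alphabet is taken literally from the `[a-z0-9]{32}` pattern above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STORE_PREFIX "/nix/store/"
#define HASH_LEN 32

/* Returns 1 if c is in [a-z0-9], as in the pattern from option 3. */
static int is_hash_char(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
}

/* Rewrites "/nix/store/<32-char hash>-rest" to "/nix/store/-rest".
   Copies the input unchanged if it does not match the pattern.
   Returns 0 on success, -1 if the output buffer is too small. */
int mangle_store_path(const char *in, char *out, size_t out_size)
{
    size_t plen = strlen(STORE_PREFIX);
    size_t inlen = strlen(in);
    int match = 0;

    if (inlen > plen + HASH_LEN
        && strncmp(in, STORE_PREFIX, plen) == 0
        && in[plen + HASH_LEN] == '-') {
        match = 1;
        for (size_t i = 0; i < HASH_LEN; i++) {
            if (!is_hash_char(in[plen + i])) {
                match = 0;
                break;
            }
        }
    }

    if (!match) {
        if (inlen + 1 > out_size)
            return -1;
        memcpy(out, in, inlen + 1);
        return 0;
    }

    /* Keep the prefix, skip the hash, keep "-rest" and the terminator. */
    const char *rest = in + plen + HASH_LEN;
    size_t restlen = strlen(rest);
    if (plen + restlen + 1 > out_size)
        return -1;
    memcpy(out, STORE_PREFIX, plen);
    memcpy(out + plen, rest, restlen + 1);
    return 0;
}

void demo_mangle(void)
{
    char buf[128];
    int rc = mangle_store_path(
        "/nix/store/rb3q4kcyfg77cmkiwywx2aqdd3x5ch93-libmpc-1.3.1",
        buf, sizeof buf);
    assert(rc == 0);
    assert(strcmp(buf, "/nix/store/-libmpc-1.3.1") == 0);
}
```

In the compiler the same rewrite would apply to the string that `__FILE__` expands to; the sketch shows only the string transformation itself.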

More words:

`NixOS` (and `nixpkgs` repository) install every software package into
an individual directory with unique prefix. Some examples:

/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev
/nix/store/rb3q4kcyfg77cmkiwywx2aqdd3x5ch93-libmpc-1.3.1
/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2
...

It's a fundamental design decision to allow parallel package installs.

From a dependency-tracking standpoint it's highly undesirable to have
these absolute paths hardcoded into final executable binaries if
they are not used at runtime.

Example redundant path we would like not to have in final binaries:

$ strings result/bin/nix | grep phjcmy025rd1ankw5y1b21xsdii83cyk
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/json.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/output/serializer.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/conversions/to_chars.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/lexer.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iter_impl.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/json_sax.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iteration_proxy.hpp
/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/parser.hpp

Those paths are inserted via glibc's assert() uses of the `__FILE__`
macro and thus hardcode header file paths from various packages
(like lttng-ust or nlohmann/json) into compiled binaries. Sometimes
`__FILE__` usage is more creative than assert().

I would like to get rid of references to header files. I think
`-fmacro-prefix-map=` is ideal for this particular use case.

The prototype, which generates the equivalent of the following options,
does work for smaller packages:

-fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/-glibc-2.37-8-dev
-fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/-gmp-with-cxx-6.3.0-dev
-fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/-nlohmann_json-3.11.2
...

The above works for a small number of options (like 100). But at around
1000 options we start hitting the linux limit on a single environment
variable for real-world packages like `qemu` with a ton of input
dependencies.

The command-line limitations are in various places:
- `gcc` limitation of lifting all command line options into a single
  environment variable: https://gcc.gnu.org/PR111527
- `linux` limitation of constraining a single environment variable to a
  size far below the full available environment space:
  https://lkml.org/lkml/2023/9/24/381

A `linux` fix would buy us 50x more budget (a lot), but it would not
help much on other operating systems like `Darwin`, where the absolute
environment limit is a lot lower than on `linux`.

I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
(also attached `mangle-NIX_STORE-in-__FILE__.patch` 3.5K patch against
`master` as a proof of concept).

What would be the best way to scale up `-fmacro-prefix-map=` up to NixOS
needs for