https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121093

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #4)
> > in the end I'm not sure what's "wrong" here and why you think you are
> > missing p2 - p2 is not executed, you shouldn't get any profile on it.
> 
> Seems we kind of disagree on how "executed" is defined.
> If you compile with -O0 then p2 is executed and you can set a breakpoint in it:
> 
> (gdb) break p2
> Breakpoint 1 at 0x40112c: file t.c, line 8.
> (gdb) r
> Starting program: /tmp/a.out 
> Breakpoint 1, p2 (a=0) at t.c:8
> 8               return a+2;
> 
> In the optimized binary p1, p2 and part of p3 are all executed as a single
> instruction:
> 
>         .loc 1 14 9 view .LVU1
> .LBB6:
> .LBI6:
>         .loc 1 2 12 view .LVU2
> .LBB7:
>         .loc 1 4 9 view .LVU3
>         .loc 1 4 17 is_stmt 0 view .LVU4
>         leal    3(%rdi), %eax
> .LBE7:
> .LBE6:
> 
> I believe that debug markers are designed to make debugging of an optimized
> binary closer to debugging of an unoptimized binary in such situations, and
> it seems reasonable to expect that if I breakpoint in p2 it will trigger
> both in the optimized and the unoptimized binary.
> 
> If you do
> 
> i++;
> i++;
> i++;
> 
> which is equivalent code but without putting things into random inlines, it
> will work, since the debug statements will not be discarded.
> 
> I actually wrote the block removal code a long time ago, but that was before
> the debug statements stuff.
> 
> AFDO needs kind of similar behaviour since it reads profile of optimized
> binary and retrofits it to not yet fully optimized code and relies on debug
> info to hold this together.
> 
> This is a bit of an extreme example and it is easy to fix the issue at
> profile read-in. However, it is based on what happens in deepsjeng.  In C++,
> if you have getters/setters and iterators for everything, often multiple
> calls get combined, and if we lose the locations we may end up losing info
> on loop headers, which confuses the hot/cold logic.

I think we scrapped such blocks to shrink what we stream to LTO and reduce
the memory footprint and also to shrink useless debug info.  As you say
C++ is full of "empty" (early) inlined functions.  At some point doing this
was quite important.

But we can of course reconsider for afdo; there are already strange afdo
conditionals in the code.
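
For readers without the original t.c at hand, here is a minimal sketch of the
kind of test case being discussed. It is a hypothetical reconstruction and not
the file from the report: only the names p1, p2, p3 and the "return a+2" line
come from the quoted gdb session, while the bodies of p1 and p3, the line
layout and the p3_flat variant are assumptions made for illustration.

```c
/* Hypothetical reconstruction; not the actual t.c from the bug report. */

/* Trivial helpers in the style of getters/setters: each one is a single
   statement that the early inliner will fold into its caller. */
static inline int p1(int a)
{
    return a + 1;
}

static inline int p2(int a)
{
    return a + 2;   /* the statement the quoted gdb session stops on at -O0 */
}

/* With optimization the two inlined calls can collapse into one add of 3,
   which matches the quoted "leal 3(%rdi), %eax". */
int p3(int a)
{
    return p2(p1(a));
}

/* The "equivalent code without inlines" variant from the quoted text:
   the same arithmetic written as consecutive statements in one function. */
int p3_flat(int a)
{
    a++;
    a++;
    a++;
    return a;
}
```

Both functions compute a + 3. The difference under discussion is only in the
debug information: in p3_flat every statement keeps its own location in the
enclosing function, whereas in p3 the location for p2 depends on the inlined
block for p2 surviving.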
