https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121093

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
> in the end I'm not sure what's "wrong" here and why you think you are missing
> p2 - p2 is not executed, you shouldn't get any profile on it.

It seems we disagree on how "executed" is defined.
If you compile with -O0, p2 is executed and you can set a breakpoint in it:

(gdb) break p2
Breakpoint 1 at 0x40112c: file t.c, line 8.
(gdb) r
Starting program: /tmp/a.out 
Breakpoint 1, p2 (a=0) at t.c:8
8               return a+2;

In the optimized binary, p1, p2 and part of p3 are all executed as a single
instruction:

        .loc 1 14 9 view .LVU1
.LBB6:
.LBI6:
        .loc 1 2 12 view .LVU2
.LBB7:
        .loc 1 4 9 view .LVU3
        .loc 1 4 17 is_stmt 0 view .LVU4
        leal    3(%rdi), %eax
.LBE7:
.LBE6:
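
For illustration, a test case of roughly this shape would match the output
above. This is only a sketch: the actual t.c from the PR is not quoted here,
and the bodies of p1, p2 and p3 are assumptions inferred from the gdb session
and from the leal 3(%rdi), %eax instruction.

```c
/* Hypothetical reconstruction; the real t.c from the PR may differ. */

/* Assumed body: the surviving .loc entries point into p1. */
static inline int p1(int a)
{
        return a + 1;
}

/* Body taken from the gdb listing ("return a+2;"). */
static inline int p2(int a)
{
        return a + 2;
}

/* Assumed caller: after inlining, the whole chain is a single a + 3. */
int p3(int a)
{
        return p1(p2(a));
}
```

At -O0 each function keeps its own code, so a breakpoint in p2 has somewhere
to land. Once both calls are inlined and folded into one add, no instruction
belongs to p2 alone.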

I believe that debug markers are designed to make debugging of an optimized
binary closer to debugging of an unoptimized binary in such situations, and it
seems reasonable to expect that a breakpoint set in p2 will trigger in both the
optimized and the unoptimized binary.

If you do

i++;
i++;
i++;

which is equivalent code but without spreading things across inline functions,
it will work, since the debug statements will not be discarded.

I actually wrote the block removal code a long time ago, but that was before
the debug statement work.

AFDO needs similar behaviour, since it reads the profile of an optimized
binary, retrofits it to not yet fully optimized code, and relies on debug info
to hold this together.

This is a bit of an extreme example, and it is easy to fix the issue at profile
read-in. However, it is based on what happens in deepsjeng.  In C++, if you
have getters/setters and iterators for everything, multiple calls often get
combined, and if we lose the locations we may end up losing info on loop
headers, which confuses the hot/cold logic.
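
A sketch of the accessor pattern, written in C to match the test case rather
than in C++. The names vec, vec_size, vec_get and vec_sum are hypothetical and
do not come from deepsjeng.

```c
/* Hypothetical container with trivial accessors. */
struct vec {
        const int *data;
        int len;
};

/* Getter used in the loop header. */
static inline int vec_size(const struct vec *v)
{
        return v->len;
}

/* Getter used in the loop body. */
static inline int vec_get(const struct vec *v, int i)
{
        return v->data[i];
}

int vec_sum(const struct vec *v)
{
        int sum = 0;

        /* After inlining, the loop test comes entirely from vec_size. */
        for (int i = 0; i < vec_size(v); i++)
                sum += vec_get(v, i);
        return sum;
}
```

Once the accessors are inlined, the loop test may consist only of instructions
that originate from vec_size, so if the locations of the inlined blocks are
dropped, samples taken there may no longer be attributable to the loop header.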
