https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121093
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #4)
> > in the end I'm not sure what's "wrong" here and why you think you are
> > missing
> p2 - p2 is not executed, you shouldn't get any profile on it.
>
> Seems we kind of disagree on how "executed" is defined.
> If you compile with -O0 then p2 is executed and you can breakpoint in it
>
> (gdb) break p2
> Breakpoint 1 at 0x40112c: file t.c, line 8.
> (gdb) r
> Starting program: /tmp/a.out
> Breakpoint 1, p2 (a=0) at t.c:8
> 8         return a+2;
>
> In optimized binary both p1, p2 and part of p3 are executed as a single
> instruction:
>
>         .loc 1 14 9 view .LVU1
> .LBB6:
> .LBI6:
>         .loc 1 2 12 view .LVU2
> .LBB7:
>         .loc 1 4 9 view .LVU3
>         .loc 1 4 17 is_stmt 0 view .LVU4
>         leal    3(%rdi), %eax
> .LBE7:
> .LBE6:
>
> I believe that debug markers are designed to make debugging of optimized
> binary closer to debugging of optimized binary in such situations and it
> seems reasonable to expect that if I breakpoint in p2 it will trigger both
> in optimized and unoptimized binary.
>
> If you do
>
> i++;
> i++;
> i++;
>
> which is equivalent code but without putting things to random inlines, it
> will work, since the debug statements will not be discarded.
>
> I actually code the block removal code long time ago, but it was before
> debug statements stuff.
>
> AFDO needs kind of similar behaviour since it reads profile of optimized
> binary and retrofits it to not yet fully optimized code and relies on debug
> info to hold this together.
>
> This is bit of an extreme example and it is easy to fix the issue at profile
> read in. However, it is based on what happens in deepsjeng. In C++ if you
> have getter/setters and iterators for everything, often multiple calls get
> combined and if we lose the locations we may end up losing info on loop
> headers that confused hot/cold logic.

I think we scrapped such blocks to shrink what we stream for LTO, to reduce the memory footprint, and to trim useless debug info. As you say, C++ is full of "empty" (early) inlined functions, and at some point doing this was quite important. But we can of course reconsider for AFDO; there are already strange AFDO conditionals in the code.
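
For reference, a minimal sketch of the kind of testcase under discussion. Only the names p1, p2 and p3, the statement "return a+2;" and the combined "leal 3(%rdi), %eax" come from the quoted comment. The body of p1, the call nesting in p3 and the line layout are assumptions, and this is not the testcase attached to the PR.

```c
/* Hypothetical reconstruction, not the testcase from the PR.
   Assumed: p1 adds 1, and p3 nests the calls as p1 (p2 (a)).  */

static inline int
p1 (int a)
{
  return a + 1;   /* assumed body */
}

static inline int
p2 (int a)
{
  return a + 2;   /* statement shown at t.c:8 in the quoted gdb session */
}

int
p3 (int a)
{
  /* After early inlining the two calls are expected to fold into a
     single a + 3, which matches the quoted leal 3(%rdi), %eax.  */
  return p1 (p2 (a));
}
```

Built with -O0 -g and a caller that invokes p3 (0), "break p2" should stop inside p2 as in the quoted session. With -O2 -g the additions are expected to collapse into one instruction, and whether any location for p2 survives there is exactly the point in question.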