Re: [lldb-dev] [llvm-dev] Why is lldb telling me "variable not available"?

2020-02-06 Thread Jeremy Morse via lldb-dev
Hi Brian,

Thanks for working on coroutines and the debugging experience, and in
particular thanks for the comprehensive write-up!

On Thu, Feb 6, 2020 at 1:19 PM Brian Gesiak via llvm-dev wrote:
> Specifically, I’m trying to improve lldb’s behavior when showing
> variables in the current stack frame, when that frame corresponds to a
> coroutine function.

[...]

Everything in the IR appears correct to my eyes, although I know next
to nothing about coroutines and might have missed something. The
simplest explanation of why the variable location goes missing can be
seen in the disassembly:

> ```
> 0x401885 <+373>: movq   -0x8(%rbp), %rax
> 0x401889 <+377>: movl   $0x0, 0x40(%rax)
> 0x401890 <+384>: X movl   0x28(%rax), %edx
> 0x401893 <+387>: X addl   $0x1, %edx
> 0x401896 <+390>: X movl   %edx, 0x28(%rax)
> 0x401899 <+393>: X movl   0x40(%rax), %edx
> 0x40189c <+396>: addl   $0x1, %edx
> 0x40189f <+399>: movl   %edx, 0x40(%rax)
> ->  0x4018a2 <+402>: movl   0x28(%rax), %esi
> ```

I've marked with an 'X' before the mnemonic the instructions that the
variable's location list covers. The location of "i" is correctly
given as edx from its load to its store, and ends when edx is
overwritten with the value of "j". In all the rest of the code, the
variable's value is in memory, and the DWARF data doesn't record this.
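
As a rough illustration of why the debugger reports "variable not
available" at the current PC, here is a minimal Python sketch of a
location list lookup. The single range is an assumption read off the
'X' marks in the disassembly above, not real DWARF output, and the
names lookup_location and describe are made up for this example.

```python
from typing import List, Optional, Tuple

# Each entry is (start_pc, end_pc_exclusive, location).
# Assumption: one range covering the four instructions marked 'X'.
LOCATION_LIST_I: List[Tuple[int, int, str]] = [(0x401890, 0x40189C, "edx")]


def lookup_location(loclist, pc: int) -> Optional[str]:
    """Return the location whose range covers pc, or None if none does."""
    for start, end, location in loclist:
        if start <= pc < end:
            return location
    return None


def describe(loclist, pc: int) -> str:
    """Mimic what a debugger would report for the variable at pc."""
    location = lookup_location(loclist, pc)
    return location if location is not None else "variable not available"


if __name__ == "__main__":
    for pc in (0x401889, 0x401893, 0x401899, 0x4018A2):
        print(hex(pc), describe(LOCATION_LIST_I, pc))
```

The stopped PC (0x4018a2) falls outside the only range, so the lookup
finds nothing even though the value is sitting in memory.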

Ideally debug info would track variables when they're stored to memory
-- however we don't automatically know whether any subsequent store to
memory will overwrite that variable, and so we don't track locations
into memory. PR40628 [0] is an example of what can go wrong, where we
described a variable as being in memory, but didn't know when that
location was overwritten.

If whatever's producing the coroutine IR has guarantees about where
and when variables are loaded/stored from/to memory, it should be
possible to put more information into the IR, so that the rest of LLVM
doesn't have to guess. For example, this portion of IR:

  %15 = load i32, i32* %i.reload.addr62, align 4, !dbg !670
  call void @llvm.dbg.value(metadata i32 %15, metadata !659, metadata !DIExpression()), !dbg !661
  %inc19 = add nsw i32 %15, 1, !dbg !670
  call void @llvm.dbg.value(metadata i32 %inc19, metadata !659, metadata !DIExpression()), !dbg !661
  store i32 %inc19, i32* %i.reload.addr62, align 4, !dbg !670

could have a call to llvm.dbg.addr(metadata i32 *%i.reload.addr66,
...) inserted after the store, indicating that the variable is located
in memory. This should work (TM) so long as two things hold on every
path after the call to llvm.dbg.addr: that memory is never overwritten
with something that isn't the current value of "i"; and whenever the
variable is loaded from memory, there's a call to llvm.dbg.value to
indicate that the variable is now located somewhere other than memory.

Providing that extra information should improve the location coverage
for your example, certainly when unoptimised. However, I believe (80%)
this method isn't safe against optimisation, because (for example)
dead stores can be deleted by LLVM passes without deleting the call to
llvm.dbg.addr, pointing the variable location at a stale value in
memory. Unfortunately I'm not aware of a facility or technique that
protects against this right now. (CC Reid who I think ran into this
last?).
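
To make the stale-value concern concrete, here is a toy Python model.
It is not how LLVM works internally. The event names and the
debugger_view helper are hypothetical, and the event stream only
loosely follows the IR above. It replays a sequence of events and
compares what a debugger would show for "i" against the value the
source program expects.

```python
def debugger_view(events):
    """Replay events in order; return (value shown by debugger, true value)."""
    memory = None      # contents of the variable's slot in memory
    true_value = None  # value of "i" according to the source program
    location = None    # None, ("value", v) or ("memory",)
    for kind, arg in events:
        if kind == "compute":      # a new value of i is computed in a register
            true_value = arg
        elif kind == "store":      # the register is written back to the slot
            memory = true_value
        elif kind == "dbg.value":  # debug info: i currently has the value arg
            location = ("value", arg)
        elif kind == "dbg.addr":   # debug info: i lives in the memory slot
            location = ("memory",)
    if location is None:
        return None, true_value
    shown = memory if location[0] == "memory" else location[1]
    return shown, true_value


# Hypothetical stream: i is stored as 0, then incremented and stored again,
# with a dbg.addr after each store.
WITH_STORE = [
    ("compute", 0), ("store", None), ("dbg.addr", None),
    ("compute", 1), ("dbg.value", 1), ("store", None), ("dbg.addr", None),
]

# The same stream after a pass deleted the second store as dead (index 5)
# but left the trailing dbg.addr in place.
WITHOUT_STORE = [e for idx, e in enumerate(WITH_STORE) if idx != 5]

if __name__ == "__main__":
    print(debugger_view(WITH_STORE))     # (1, 1)
    print(debugger_view(WITHOUT_STORE))  # (0, 1): the debugger shows a stale 0
```

With the store present the debugger and the program agree; once the
store is removed but the dbg.addr stays, the debugger reads the old
contents of the slot.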

Note that there's some support for tracking variables through stack
spills in post-isel debug data passes; however, those loads and stores
operate in well-defined ways, and general loads and stores might not.

[0] https://bugs.llvm.org/show_bug.cgi?id=40628

--
Thanks,
Jeremy
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] Why is lldb telling me "variable not available"?

2020-02-26 Thread Jeremy Morse via lldb-dev
Hi Brian,

On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak wrote:
> In other words, the value of %i is stored on the frame object, on the
> heap, at an offset of 7 into the frame. I'm beginning to think a
> fundamental fix for this issue would be to stop replacing
> llvm.dbg.declare with llvm.dbg.value, and instead replace the
> llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i
> variable's new permanent location as an offset into the coroutine
> frame object. Does this approach make sense to people on this mailing
> list, who probably know more about how these intrinsics work than I
> do?

This matches a few similar use cases that I'm aware of -- certain
kinds of struct that are passed-by-value according to the language,
but passed-by-reference according to ABI, are treated in that way. In
general, the downside is that the debugger can only observe variable
values when they get written to memory, not when they're computed, as
dbg.values and dbg.declares aren't supposed to be mixed. Observing
variable values slightly later might be an improvement over the
current situation.

However, I don't think this will work immediately; see below.

> I tried multiple approaches to manually inserting an llvm.dbg.addr
> after the store instruction, as per your suggestion, Jeremy. I used
> llc to compile the IR into an object file that I then linked, and
> inspected the DWARF generated for the file. Unfortunately, inserting
> dbg.addr that operated on the reloaded values didn't lead to any
> change in the DWARF that was produced --  specifically, this didn't
> make a difference:
>
> call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata
> !873, metadata !DIExpression()), !dbg !884

Ouch, I tried this myself, and ran into the same difficulty. I'd
missed that all your functions are marked "optnone" / -O0, which means
a different instruction-selection pass (FastISel) runs, and it turns
out FastISel isn't aware of dbg.addr's existence. Even better, FastISel
doesn't manage to lower any debug intrinsic (including dbg.declare)
that refers to a GEP, because it doesn't have a register location (the
GEP gets folded into a memory addressing mode).

I've hacked together some support in [0], that allows dbg.addrs of
GEPs to be handled. A single dbg.addr at the start of the function
(and no dbg.values) should get you the same behaviour as a
dbg.declare. I suspect the reason why this problem hasn't shown up in
the past is because the coroutine code being generated hits a gap
between "optimised" and "not optimised": I believe all variables in
code that isn't optimised get their own storage (and so will always
have a stack or register location). Whereas in the coroutine code
you're generating, the variable address doesn't get storage.

If [0] is useful for you I can get that landed; it'd be good to hear
whether this resolves the dbg.addr intrinsics not having an effect on
the output.

[0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b

--
Thanks,
Jeremy