[Dwarf-Discuss] Location list entries for caller-saved registers at time of call
Hi!

When GDB and LLDB perform virtual unwinding, they subtract one byte from the return addresses of the outer frames. This is necessary, for example, when unwinding from a non-returning call that is placed last in the function, as the return address can then point into a different function. I assume that this is also necessary to get the variables that were in scope at the time of the call, the right location expressions for those variables, etc.

As far as I have understood it, GCC utilizes this fact for location list entries that are expressed in caller-saved registers, by subtracting one from the (exclusive) ending address of the entries. This means that variables in outer frames that are located in caller-saved registers will be printed out as <optimized out> by GDB.

I have not been able to find anything in the DWARF standard that describes this. Is this something that is defined by the standard, or is it established practice between GCC and GDB? If the latter, do you know of other producers and consumers that behave like this?

The reason I ask is that Clang/LLVM at the moment ends location list entries expressed in caller-saved registers at the first instruction after the call. This means that variables in outer frames using such location list entries will incorrectly be evaluated using the inner-most frame's register values when debugging in GDB.

(As a side note, as far as I have understood it, LLDB has fallback knowledge of the ABI, so caller-saved registers will be considered unavailable in outer frames, meaning that such location list entries are not an issue when combining Clang/LLVM and LLDB.)

Best regards,
David
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
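To make the scenario in this message concrete, here is a minimal sketch of a location list lookup at "return address minus one". It is only an illustration under stated assumptions: the addresses, the 5-byte call size, the tuple layout, and the `lookup` helper are all hypothetical and do not reflect how GDB, GCC, or Clang are actually implemented.

```python
from typing import List, Optional, Tuple

# Hypothetical, simplified location list entry: (start, end_exclusive, location).
Entry = Tuple[int, int, str]


def lookup(entries: List[Entry], pc: int) -> Optional[str]:
    """Return the location whose half-open range [start, end) contains pc."""
    for start, end, location in entries:
        if start <= pc < end:
            return location
    return None  # no entry covers pc; a debugger would report no location


CALL_ADDR = 0x1000    # assumed address of a 5-byte call instruction
RETURN_ADDR = 0x1005  # first instruction after the call

# Entry that ends at the first instruction after the call
# (the behaviour the message attributes to Clang/LLVM).
ends_after_call = [(0x0FF0, RETURN_ADDR, "caller-saved register R")]

# Entry whose exclusive end is one byte earlier
# (the behaviour the message attributes to GCC).
ends_one_early = [(0x0FF0, RETURN_ADDR - 1, "caller-saved register R")]

# For an outer frame, the lookup address is the return address minus one.
outer_frame_pc = RETURN_ADDR - 1

print(lookup(ends_after_call, outer_frame_pc))
print(lookup(ends_one_early, outer_frame_pc))
```

In this model the first lookup still finds the register location, while the second finds nothing, which is the difference the message is asking about.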
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
On Thu, Dec 06 2018, David Stenberg via Dwarf-Discuss wrote:

> [...] variables in outer frames using such location list entries will
> incorrectly be evaluated using the inner-most frame's register values
> when debugging in GDB.

If GDB uses caller-saved register values from the inner-most frame in outer frames, then this is a bug. Note that this could also be caused by bad CFI.

--
Andreas
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
On Thu, 2018-12-06 at 16:32 +0100, Andreas Arnez via Dwarf-Discuss wrote:

> If GDB uses caller-saved register values from the inner-most frame in
> outer frames, then this is a bug. Note that this could also be caused
> by bad CFI.

Hmm, right. I'm not very familiar with the design philosophy of GDB, but as far as I have understood it, they prefer to rely on the producer to emit such information.

In the LLVM bug report [0] I wrote for this, I mentioned using DW_CFA_undefined for caller-saved registers, or at least for those that it knows are clobbered.

Neither GCC nor Clang emits CFI for caller-saved registers at the moment. I have not been successful in finding out why, but my (uneducated) guess is that at least one of the reasons may be debug info size concerns.

[0] https://bugs.llvm.org/show_bug.cgi?id=39752

Best regards,
David
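The effect of an explicit DW_CFA_undefined rule can be illustrated with a toy model of register unwinding. This is a rough sketch, not the behaviour of any real unwinder: the `unwind_registers` helper, the register name, and the two-rule model are hypothetical, and the default rule is passed in as a parameter because what a consumer assumes for registers without CFI depends on the ABI and on the consumer.

```python
from typing import Dict, Optional

# Toy subset of CFI register rules (names chosen for this sketch).
SAME_VALUE = "same_value"  # outer frame sees the inner frame's value
UNDEFINED = "undefined"    # value is not recoverable in the outer frame


def unwind_registers(
    inner: Dict[str, int],
    rules: Dict[str, str],
    default_rule: str,
) -> Dict[str, Optional[int]]:
    """Compute the outer frame's view of each register in the inner frame.

    Registers without an explicit rule fall back to default_rule, which
    is an assumption about the consumer, not something CFI encodes here.
    """
    outer: Dict[str, Optional[int]] = {}
    for reg, value in inner.items():
        rule = rules.get(reg, default_rule)
        outer[reg] = value if rule == SAME_VALUE else None
    return outer


inner_frame = {"r_caller_saved": 42}  # hypothetical clobbered register

# No CFI for the register, consumer assumes "same value": the outer
# frame inherits the callee's (clobbered) value.
print(unwind_registers(inner_frame, {}, SAME_VALUE))

# Producer marks the register undefined: the outer frame sees no value.
print(unwind_registers(inner_frame, {"r_caller_saved": UNDEFINED}, SAME_VALUE))
```

The trade-off mentioned in the message shows up as the size of the `rules` mapping: an explicit entry per caller-saved register is extra CFI that the producer would have to emit.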
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
On 12/06/2018 04:40 AM, David Stenberg via Dwarf-Discuss wrote:

> Hi!
>
> When GDB and LLDB perform virtual unwinding, they subtract one byte
> from the return addresses of the outer frames. This is for example
> necessary when unwinding from a non-returning call that is placed last
> in the function, as the return address then can point to a different
> function. I assume that this is also necessary to get the variables
> that were in scope at the time of the call, and the right location
> expressions for the variables, etc.

As you mention, GDB is trying to ensure that the computed return address is within the scope of the function, so that the function will be identified correctly.

> As far as I have understood it, GCC utilizes this fact for location
> list entries that are expressed in caller-saved registers, by
> subtracting one from the (exclusive) ending address of the entries.

Similarly, the ending address of a LocList entry could be one beyond the end of a function, if the address is of a variable allocated at the end. Subtracting one ensures that the correct function is identified.

> This means that variables in outer frames that are located in
> caller-saved registers will be printed out as <optimized out> by GDB.

Why?

> I have not been able to find anything in the DWARF standard that
> describes this. Is this something that is defined by standard, or is it
> established praxis between GCC and GDB? If the latter, do you know of
> other producers and consumers that behave like this?

No, this is not part of the DWARF specification. This isn't any shared knowledge between GCC and GDB. Subtracting one from the end address to ensure that the address is within the function is simply to accommodate the occasional situation where the call is the last instruction in the function.

> The reason why I ask this is because Clang/LLVM at the moment ends
> location list entries expressed in caller-saved registers at the first
> instruction after the call. This means that variables in outer frames
> using such location list entries will incorrectly be evaluated using
> the inner-most frame's register values when debugging in GDB.

The DWARF definition of a LocList entry (Sec. 2.6.2) says that "[t]he starting address is the lowest address of the address range over which the location is valid. The ending address is the address of the first location past the highest address of the address range."

GDB frame handling is recursive and can be confusing. In most cases, GDB uses the next frame pointer as a handle for evaluating variables in the current frame. That does not mean that the evaluation is using the inner-frame registers.

(From your other email):

> Hmm, right. I'm not very familiar with the design philosophy of GDB,
> but as far as I have understood it they prefer to rely on producer to
> emit such information.
>
> In the LLVM bug report [0] I wrote for this I mentioned using
> DW_CFA_undefined for caller-saved registers, or at least for those that
> it knows are clobbered.

Why?

> Neither GCC nor Clang emits CFI for caller-saved registers at the
> moment. I have not been successful in finding out why, but my
> (uneducated) guess is that, maybe at least one of the reasons, is debug
> info size concerns?

Exactly. Some (most?) producers do not encode ABI requirements in the CFI, since this would be duplicated in every CFI entry. DWARF philosophy is not to duplicate information which is defined by the ABI or other architecture definitions.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
Comments are inline...

On Thu, Dec 6, 2018 at 7:40 AM David Stenberg via Dwarf-Discuss <dwarf-discuss@lists.dwarfstd.org> wrote:

> Hi!
>
> When GDB and LLDB perform virtual unwinding, they subtract one byte
> from the return addresses of the outer frames. This is for example
> necessary when unwinding from a non-returning call that is placed last
> in the function, as the return address then can point to a different
> function. I assume that this is also necessary to get the variables
> that were in scope at the time of the call, and the right location
> expressions for the variables, etc.

Yes, I am aware of this practice--indeed, it is mentioned in Section 6.4.4 of both the V4 and V5 standards.

Another perfectly good solution is for the compiler to ensure that the return PC is always in the right scope to begin with. All it takes is to include a (never executed) NOP following any non-returning CALL at the last address of the routine. Such calls are not common, plus many environments align the beginning of (any subsequent) functions anyway, so padding bytes are likely to be available. As a result, such "extra" bytes are not going to be a space issue.

> As far as I have understood it, GCC utilizes this fact for location
> list entries that are expressed in caller-saved registers, by
> subtracting one from the (exclusive) ending address of the entries.
> This means that variables in outer frames that are located in
> caller-saved registers will be printed out as <optimized out> by GDB.

Here you lose me. Once you do what is described in your first paragraph, there seems no need to do anything special about location lists at all. You seem to be saying that GCC somehow misrepresents those ranges for some reason, but I don't follow how or why or under what circumstances.

> I have not been able to find anything in the DWARF standard that
> describes this. Is this something that is defined by standard, or is it
> established praxis between GCC and GDB? If the latter, do you know of
> other producers and consumers that behave like this.

As mentioned, see Section 6.4.4 of both the V4 and V5 standards. GCC (or other producers) has nothing to do with it. GCC (or other producers) should just describe the program as it exists. Only the debugger needs to know how to avoid this quirk of no-return optimization.

> The reason why I ask this is because Clang/LLVM at the moment ends
> location list entries expressed in caller-saved registers at the first
> instruction after the call. This means that variables in outer frames
> using such location list entries will incorrectly be evaluated using
> the inner-most frame's register values when debugging in GDB.

But if GDB does what you say in the first paragraph, this will not be a problem. I don't follow...

> (As a side note, as far as I have understood it, LLDB has fallback
> knowledge of the ABI, so caller-saved registers will be considered
> unavailable in outer frames, meaning that such location list entries
> are not an issue when combining Clang/LLVM and LLDB.)
>
> Best regards,
> David
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
> Another perfectly good solution is for the compiler to assure that the
> return PC is always in the right scope to begin with. All it takes is
> to include a (never executed) NOP following any non-returning CALL at
> the last address of the routine. Such calls are not common, plus many
> environments align the beginning of (any subsequent) functions anyway
> so padding bytes are likely to be available. As a result, such "extra"
> bytes are not going to be a space issue.

In fact, the PA-RISC and Itanium calling conventions specifically require this.

-cary
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
On 12/06/2018 12:47 PM, Cary Coutant via Dwarf-Discuss wrote:

>> Another perfectly good solution is for the compiler to assure that the
>> return PC is always in the right scope to begin with. All it takes is
>> to include a (never executed) NOP following any non-returning CALL at
>> the last address of the routine. Such calls are not common, plus many
>> environments align the beginning of (any subsequent) functions anyway
>> so padding bytes are likely to be available. As a result, such "extra"
>> bytes are not going to be a space issue.
>
> In fact, the PA-RISC and Itanium calling conventions specifically
> require this.

Not all ABIs do this. Many allow the end of one function to be immediately followed by the start of the next function.

Subtracting one from the return address is a trivial way to ensure that the address is within the calling function and not in the next function. And it works on all architectures.

In any case, this is the way that GDB has handled return addresses for n >= 2 decades. Whatever the OP's issue is, this is not related.

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
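The two approaches discussed so far (subtracting one in the consumer versus padding after a non-returning call in the producer) can be compared with a small sketch. Everything here is assumed for illustration: the function names, the addresses, the one-byte NOP, and the `function_at` helper are hypothetical, and alignment is ignored.

```python
from typing import List, Optional, Tuple

# Hypothetical function record: (name, low_pc, high_pc_exclusive).
Function = Tuple[str, int, int]


def function_at(functions: List[Function], pc: int) -> Optional[str]:
    """Return the name of the function whose [low_pc, high_pc) contains pc."""
    for name, low, high in functions:
        if low <= pc < high:
            return name
    return None


# caller() ends with a non-returning call; its return address is 0x2010.
RETURN_ADDR = 0x2010

# Layout 1: next_fn() starts immediately after the call.
packed = [("caller", 0x2000, 0x2010), ("next_fn", 0x2010, 0x2030)]

# Layout 2: a never-executed one-byte NOP follows the call, so the
# return address is still inside caller().
padded = [("caller", 0x2000, 0x2011), ("next_fn", 0x2011, 0x2031)]

print(function_at(packed, RETURN_ADDR))      # raw return address, no padding
print(function_at(packed, RETURN_ADDR - 1))  # consumer subtracts one
print(function_at(padded, RETURN_ADDR))      # producer padded with a NOP
```

With the packed layout, only the adjusted address resolves to the caller; with the padded layout, the raw return address already does.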
Re: [Dwarf-Discuss] Location list entries for caller-saved registers at time of call
>> In fact, the PA-RISC and Itanium calling conventions specifically
>> require this.
>
> Not all ABIs do this. Many allow the end of one function to be
> immediately followed by the start of the next function.

We're talking about the narrow case of a function that ends with a no-return call. In all other situations, of course it's fine to have one function immediately follow the previous. This also applies to no-return calls in the interior of a function, where the next instruction might be part of a code region with different unwind descriptors.

> Subtracting one from the return address is a trivial way to insure
> that the address is within the calling function and not in the
> next function. And it works on all architectures.

It's not exactly trivial: it's not always correct to do this (e.g., when unwinding from a signal handler), and the unwinder needs to know how to recognize those kinds of frames. For PA-RISC and Itanium, we elected to make the rule simple: the pc value, whether it derives from a return pointer or a sigcontext, will always get you the correct unwind information.

But we're getting sidetracked from the OP's question: Does GCC in fact subtract one from the upper bound of a location list entry for a variable contained in a caller-saved register? I can think of no reason why it should do this. If the compiled code does something like this:

    A:  promote variable X into caller-save register R
        ...
        call f
    B:  ...
        spill register R back to X
    C:  ...

The location list entry for X should show it live in register R in the range [A..C). The call should not interrupt the range. It sounds like David has an example where the location list entry shows X in register R in the range [A..B-1), then presumably again for [B..C), leaving [B-1..B) uncovered. That would be a bug, in my opinion.

David, can you show us an example?

-cary
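The coverage argument in this message can be checked mechanically. The following is a minimal sketch with made-up addresses for the labels A, B, and C; the `uncovered` helper is hypothetical and only reports which parts of [lo, hi) are not covered by any half-open entry.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end)


def uncovered(entries: List[Range], lo: int, hi: int) -> List[Range]:
    """Return the sub-ranges of [lo, hi) not covered by any entry."""
    gaps: List[Range] = []
    cursor = lo
    for start, end in sorted(entries):
        if start > cursor and cursor < hi:
            gaps.append((cursor, min(start, hi)))
        cursor = max(cursor, end)
    if cursor < hi:
        gaps.append((cursor, hi))
    return gaps


# Assumed addresses for the labels in the example above.
A, B, C = 0x100, 0x120, 0x140  # B is the return address of "call f"

single_entry = [(A, C)]            # X live in R over [A..C)
split_entry = [(A, B - 1), (B, C)]  # [A..B-1) then [B..C)

print(uncovered(single_entry, A, C))
print(uncovered(split_entry, A, C))
```

In the split case the only gap is the single byte [B-1..B), which is exactly the address a debugger would use when it looks up the outer frame at the return address minus one.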