[lldb-dev] [Bug 47766] New: String formatter regex wrongly catching multi-dimensional types as well.
https://bugs.llvm.org/show_bug.cgi?id=47766 Bug ID: 47766 Summary: String formatter regex wrongly catching multi-dimensional types as well. Product: lldb Version: unspecified Hardware: PC OS: Linux Status: NEW Severity: enhancement Priority: P Component: All Bugs Assignee: lldb-dev@lists.llvm.org Reporter: chi...@raincode.com CC: jdevliegh...@apple.com, llvm-b...@lists.llvm.org For a simple C program, struct b { char i[2][4]; }; int main() { struct b z; z.i[0][0] = 'F'; z.i[0][1] = 'O'; z.i[0][2] = 'O'; z.i[0][3] = 0; z.i[1][0] = 'B'; z.i[1][1] = 'A'; z.i[1][2] = 'R'; z.i[1][3] = 0; return 0; } while printing variable z, the type summary finder finds the summary provider because it matches the regex "char [[0-9]+]" while looking for char [2][4] and failing while printing due to not knowing how to handle the datatype. (As per conversation on lldb-dev mail thread)The regex should be "char [[0-9]+]$" with string terminator anchor to start matching at end to avoid trying to print multi-dimensional data. LLDB currently, (lldb) target create "/home/chirag/a.out" Current executable set to '/home/chirag/a.out' (x86_64). (lldb) b 15 Breakpoint 1: where = a.out`main + 36 at struct.c:15, address = 0x00400511 (lldb) r Process 57100 launched: '/home/chirag/a.out' (x86_64) Process 57100 stopped * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x00400511 a.out`main at struct.c:15 12 z.i[1][1] = 'A'; 13 z.i[1][2] = 'R'; 14 z.i[1][3] = 0; -> 15 return 0; 16 } (lldb) p z (b) $0 = (i = char [2][4] @ 0x01fe3ca0) (lldb) q GDB while gdb prints as expected, Reading symbols from /home/chirag/a.out...done. (gdb) b 15 Breakpoint 1 at 0x400511: file Desktop/test/struct.c, line 15. (gdb) r Starting program: /home/chirag/./a.out Breakpoint 1, main () at Desktop/test/struct.c:15 15 return 0; Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64 (gdb) p z $1 = {i = {"FOO", "BAR"}} (gdb) q A debugging session is active. -- You are receiving this mail because: You are the assignee for the bug.___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
[lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
Hello LLDB devs, This is a deep dive into an issue I found in the LLDB handling of DWARF call frame information, so no need to read further if this doesn't interest you! I am in the process of adding some support to LLVM for parsing the opcode state machines for CIE and FDE objects that produces unwind rows. While making unit tests to test DW_CFA_restore and DW_CFA_restore_extended opcodes, I read the DWARF spec that states: "The DW_CFA_restore instruction takes a single operand (encoded with the opcode) that represents a register number. The required action is to change the rule for the indicated register to the rule assigned it by the initial_instructions in the CIE." Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is simplified to: case DW_CFA_restore: if (unwind_plan.IsValidRowIndex(0) && unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) row->SetRegisterInfo(reg_num, reg_location); break; The issue is, the CIE contains initial instructions, but it doesn't push a row after doing these instructions, the FDE will push a row when it emits a DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we should restore the register rule to be what it was in the CIE's initial instructions, but we are restoring it to the first row that was parsed. This will mostly not get us into trouble because .debug_frame and .eh_frame usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first opcode, but it isn't a requirement and a FDE could modify a register value prior to pushing the first row at index zero. So we might be restoring the register incorrectly in some cases according to the spec. Also, what if there was no value specified in the CIE's initial instructions for a register? Should we remove the register value to match the state of the CIE's initial instructions if there is no rule for the register? We are currently leaving this register as having the same value if there is no value for the register in the first row. Let me know what you think. Greg Clayton ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
On 2020-10-08, Greg Clayton via lldb-dev wrote: Hello LLDB devs, This is a deep dive into an issue I found in the LLDB handling of DWARF call frame information, so no need to read further if this doesn't interest you! I am in the process of adding some support to LLVM for parsing the opcode state machines for CIE and FDE objects that produces unwind rows. While making unit tests to test DW_CFA_restore and DW_CFA_restore_extended opcodes, I read the DWARF spec that states: "The DW_CFA_restore instruction takes a single operand (encoded with the opcode) that represents a register number. The required action is to change the rule for the indicated register to the rule assigned it by the initial_instructions in the CIE." Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is simplified to: case DW_CFA_restore: if (unwind_plan.IsValidRowIndex(0) && unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) row->SetRegisterInfo(reg_num, reg_location); break; The issue is, the CIE contains initial instructions, but it doesn't push a row after doing these instructions, the FDE will push a row when it emits a DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we should restore the register rule to be what it was in the CIE's initial instructions, but we are restoring it to the first row that was parsed. This will mostly not get us into trouble because .debug_frame and .eh_frame usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first opcode, but it isn't a requirement and a FDE could modify a register value prior to pushing the first row at index zero. So we might be restoring the register incorrectly in some cases according to the spec. Also, what if there was no value specified in the CIE's initial instructions for a register? Should we remove the register value to match the state of the CIE's initial instructions if there is no rule for the register? We are currently leaving this register as having the same value if there is no value for the register in the first row. Let me know what you think. Greg Clayton Thanks for noticing this. This is indeed a bug. nongnu.org libunwind is correct: http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/dwarf/Gparser.c;h=da170d4b35fe3d5ac86a81bec8bb9a08939e010d;hb=HEAD#l523 llvm libunwind (DwarfParser.hpp) makes an approximation by only tracking the first instruction setting up a register (it works in practice because initial instructions don't use multiple setup instructions on one register) ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
Good bug find! It seems to me that DWARFCallFrameInfo should push the initial CIE register setup instructions as being the state at offset 0 in the function (in fact I'd say it's a bug if it's not). If that were done, then getting RowAtIndex(0) should be synonymous with get-the-CIE-register-unwind-rules, and this code would be correct. Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do unwind_plan.SetPlanValidAddressRange(range); UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; *cie_initial_row = cie->initial_row; UnwindPlan::RowSP row(cie_initial_row); unwind_plan.SetRegisterKind(GetRegisterKind()); unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing that initial row at offset 0, isn't it? We don't really use DWARF CFI on darwin any more so I don't have a lot of real world experience here. > On Oct 8, 2020, at 4:01 PM, Greg Clayton wrote: > > Hello LLDB devs, > > This is a deep dive into an issue I found in the LLDB handling of DWARF call > frame information, so no need to read further if this doesn't interest you! > > I am in the process of adding some support to LLVM for parsing the opcode > state machines for CIE and FDE objects that produces unwind rows. While > making unit tests to test DW_CFA_restore and DW_CFA_restore_extended opcodes, > I read the DWARF spec that states: > > "The DW_CFA_restore instruction takes a single operand (encoded with the > opcode) that represents a register number. The required action is to change > the rule for the indicated register to the rule assigned it by the > initial_instructions in the CIE." > > Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is > simplified to: > > case DW_CFA_restore: > if (unwind_plan.IsValidRowIndex(0) && > unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) > row->SetRegisterInfo(reg_num, reg_location); > break; > > > The issue is, the CIE contains initial instructions, but it doesn't push a > row after doing these instructions, the FDE will push a row when it emits a > DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, > DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we > should restore the register rule to be what it was in the CIE's initial > instructions, but we are restoring it to the first row that was parsed. This > will mostly not get us into trouble because .debug_frame and .eh_frame > usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first > opcode, but it isn't a requirement and a FDE could modify a register value > prior to pushing the first row at index zero. So we might be restoring the > register incorrectly in some cases according to the spec. Also, what if there > was no value specified in the CIE's initial instructions for a register? > Should we remove the register value to match the state of the CIE's initial > instructions if there is no rule for the register? We are currently leaving > this register as having the same value if there is no value for the register > in the first row. > > Let me know what you think. > > Greg Clayton > ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
> On Oct 8, 2020, at 8:55 PM, Jason Molenda wrote: > > Good bug find! > > It seems to me that DWARFCallFrameInfo should push the initial CIE register > setup instructions as being the state at offset 0 in the function (in fact > I'd say it's a bug if it's not). If that were done, then getting > RowAtIndex(0) should be synonymous with get-the-CIE-register-unwind-rules, > and this code would be correct. > > Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do > > unwind_plan.SetPlanValidAddressRange(range); > UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; > *cie_initial_row = cie->initial_row; > UnwindPlan::RowSP row(cie_initial_row); > > unwind_plan.SetRegisterKind(GetRegisterKind()); > unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); > > cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. > > I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing that > initial row at offset 0, isn't it? We don't really use DWARF CFI on darwin > any more so I don't have a lot of real world experience here. The only opcodes that push a row are DW_CFA_advance_locXXX and DW_CFA_set_loc, so I don't think that is the right fix. I think we need to pass a copy of just the registers from the "cie->initial_row" object around to the opcode parsing code for restoration purposes. > > > >> On Oct 8, 2020, at 4:01 PM, Greg Clayton wrote: >> >> Hello LLDB devs, >> >> This is a deep dive into an issue I found in the LLDB handling of DWARF call >> frame information, so no need to read further if this doesn't interest you! >> >> I am in the process of adding some support to LLVM for parsing the opcode >> state machines for CIE and FDE objects that produces unwind rows. While >> making unit tests to test DW_CFA_restore and DW_CFA_restore_extended >> opcodes, I read the DWARF spec that states: >> >> "The DW_CFA_restore instruction takes a single operand (encoded with the >> opcode) that represents a register number. The required action is to change >> the rule for the indicated register to the rule assigned it by the >> initial_instructions in the CIE." >> >> Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is >> simplified to: >> >> case DW_CFA_restore: >> if (unwind_plan.IsValidRowIndex(0) && >> unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) >> row->SetRegisterInfo(reg_num, reg_location); >> break; >> >> >> The issue is, the CIE contains initial instructions, but it doesn't push a >> row after doing these instructions, the FDE will push a row when it emits a >> DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, >> DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we >> should restore the register rule to be what it was in the CIE's initial >> instructions, but we are restoring it to the first row that was parsed. This >> will mostly not get us into trouble because .debug_frame and .eh_frame >> usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first >> opcode, but it isn't a requirement and a FDE could modify a register value >> prior to pushing the first row at index zero. So we might be restoring the >> register incorrectly in some cases according to the spec. Also, what if >> there was no value specified in the CIE's initial instructions for a >> register? Should we remove the register value to match the state of the >> CIE's initial instructions if there is no rule for the register? We are >> currently leaving this register as having the same value if there is no >> value for the register in the first row. >> >> Let me know what you think. >> >> Greg Clayton >> > ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
> On Oct 8, 2020, at 9:17 PM, Greg Clayton wrote: > > > >> On Oct 8, 2020, at 8:55 PM, Jason Molenda wrote: >> >> Good bug find! >> >> It seems to me that DWARFCallFrameInfo should push the initial CIE register >> setup instructions as being the state at offset 0 in the function (in fact >> I'd say it's a bug if it's not). If that were done, then getting >> RowAtIndex(0) should be synonymous with get-the-CIE-register-unwind-rules, >> and this code would be correct. >> >> Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do >> >> unwind_plan.SetPlanValidAddressRange(range); >> UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; >> *cie_initial_row = cie->initial_row; >> UnwindPlan::RowSP row(cie_initial_row); >> >> unwind_plan.SetRegisterKind(GetRegisterKind()); >> unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); >> >> cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. >> >> I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing that >> initial row at offset 0, isn't it? We don't really use DWARF CFI on darwin >> any more so I don't have a lot of real world experience here. > > The only opcodes that push a row are DW_CFA_advance_locXXX and > DW_CFA_set_loc, so I don't think that is the right fix. I think we need to > pass a copy of just the registers from the "cie->initial_row" object around > to the opcode parsing code for restoration purposes. The Rows in an UnwindPlan at each function offset describes the register unwind rules there. If the CIE has a register unwind rule, we should push that Row for offset 0 in the function. The function today may only push a Row when the pc has been advanced, but that is not (IMO) correct. I can't think of any register rule like this on arm64 or x86_64. You could say the CFA is $sp+8 at function entry, and the caller's $sp value is CFA+0 (aka $sp+8) to reflect the return-pc value that was pushed by the CALLQ. But I don't know if actual CFI on x86_64 includes an SP unwind rule in the CIE like this. > > >> >> >> >>> On Oct 8, 2020, at 4:01 PM, Greg Clayton wrote: >>> >>> Hello LLDB devs, >>> >>> This is a deep dive into an issue I found in the LLDB handling of DWARF >>> call frame information, so no need to read further if this doesn't interest >>> you! >>> >>> I am in the process of adding some support to LLVM for parsing the opcode >>> state machines for CIE and FDE objects that produces unwind rows. While >>> making unit tests to test DW_CFA_restore and DW_CFA_restore_extended >>> opcodes, I read the DWARF spec that states: >>> >>> "The DW_CFA_restore instruction takes a single operand (encoded with the >>> opcode) that represents a register number. The required action is to change >>> the rule for the indicated register to the rule assigned it by the >>> initial_instructions in the CIE." >>> >>> Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is >>> simplified to: >>> >>> case DW_CFA_restore: >>> if (unwind_plan.IsValidRowIndex(0) && >>>unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) >>>row->SetRegisterInfo(reg_num, reg_location); >>> break; >>> >>> >>> The issue is, the CIE contains initial instructions, but it doesn't push a >>> row after doing these instructions, the FDE will push a row when it emits a >>> DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, >>> DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we >>> should restore the register rule to be what it was in the CIE's initial >>> instructions, but we are restoring it to the first row that was parsed. >>> This will mostly not get us into trouble because .debug_frame and .eh_frame >>> usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first >>> opcode, but it isn't a requirement and a FDE could modify a register value >>> prior to pushing the first row at index zero. So we might be restoring the >>> register incorrectly in some cases according to the spec. Also, what if >>> there was no value specified in the CIE's initial instructions for a >>> register? Should we remove the register value to match the state of the >>> CIE's initial instructions if there is no rule for the register? We are >>> currently leaving this register as having the same value if there is no >>> value for the register in the first row. >>> >>> Let me know what you think. >>> >>> Greg Clayton >>> >> > ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
> On Oct 8, 2020, at 9:17 PM, Greg Clayton wrote: > > > >> On Oct 8, 2020, at 8:55 PM, Jason Molenda wrote: >> >> Good bug find! >> >> It seems to me that DWARFCallFrameInfo should push the initial CIE register >> setup instructions as being the state at offset 0 in the function (in fact >> I'd say it's a bug if it's not). If that were done, then getting >> RowAtIndex(0) should be synonymous with get-the-CIE-register-unwind-rules, >> and this code would be correct. >> >> Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do >> >> unwind_plan.SetPlanValidAddressRange(range); >> UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; >> *cie_initial_row = cie->initial_row; >> UnwindPlan::RowSP row(cie_initial_row); >> >> unwind_plan.SetRegisterKind(GetRegisterKind()); >> unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); >> >> cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. >> >> I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing that >> initial row at offset 0, isn't it? We don't really use DWARF CFI on darwin >> any more so I don't have a lot of real world experience here. > > The only opcodes that push a row are DW_CFA_advance_locXXX and > DW_CFA_set_loc, so I don't think that is the right fix. I think we need to > pass a copy of just the registers from the "cie->initial_row" object around > to the opcode parsing code for restoration purposes. I think everything I'm saying here is besides the point, though. Unless we ALWAYS push the initial unwind state (from the CIE) to an UnwindPlan, the DW_CFA_restore is not going to work. If an unwind rule for r12 says "DW_CFA_restore" and at offset 0 in the function, the unwind rule for r12 was "same" (i.e. no rule), but we return the RowAtIndex(0) and the first instruction, I don't know, spills it or something, then the DW_CFA_restore would set the r12 rule to "r12 was spilled" instead of "r12 is same". So the only way DW_CFA_restore would behave correctly, with this, is if we always push a Row at offset 0 with the rules from the CIE, or with no rules at all, just the initial unwind state showing how the CFA is set and no register rules. > > >> >> >> >>> On Oct 8, 2020, at 4:01 PM, Greg Clayton wrote: >>> >>> Hello LLDB devs, >>> >>> This is a deep dive into an issue I found in the LLDB handling of DWARF >>> call frame information, so no need to read further if this doesn't interest >>> you! >>> >>> I am in the process of adding some support to LLVM for parsing the opcode >>> state machines for CIE and FDE objects that produces unwind rows. While >>> making unit tests to test DW_CFA_restore and DW_CFA_restore_extended >>> opcodes, I read the DWARF spec that states: >>> >>> "The DW_CFA_restore instruction takes a single operand (encoded with the >>> opcode) that represents a register number. The required action is to change >>> the rule for the indicated register to the rule assigned it by the >>> initial_instructions in the CIE." >>> >>> Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is >>> simplified to: >>> >>> case DW_CFA_restore: >>> if (unwind_plan.IsValidRowIndex(0) && >>>unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) >>>row->SetRegisterInfo(reg_num, reg_location); >>> break; >>> >>> >>> The issue is, the CIE contains initial instructions, but it doesn't push a >>> row after doing these instructions, the FDE will push a row when it emits a >>> DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, >>> DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we >>> should restore the register rule to be what it was in the CIE's initial >>> instructions, but we are restoring it to the first row that was parsed. >>> This will mostly not get us into trouble because .debug_frame and .eh_frame >>> usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first >>> opcode, but it isn't a requirement and a FDE could modify a register value >>> prior to pushing the first row at index zero. So we might be restoring the >>> register incorrectly in some cases according to the spec. Also, what if >>> there was no value specified in the CIE's initial instructions for a >>> register? Should we remove the register value to match the state of the >>> CIE's initial instructions if there is no rule for the register? We are >>> currently leaving this register as having the same value if there is no >>> value for the register in the first row. >>> >>> Let me know what you think. >>> >>> Greg Clayton >>> >> > ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
> On Oct 8, 2020, at 10:06 PM, Jason Molenda wrote: > > > >> On Oct 8, 2020, at 9:17 PM, Greg Clayton wrote: >> >> >> >>> On Oct 8, 2020, at 8:55 PM, Jason Molenda wrote: >>> >>> Good bug find! >>> >>> It seems to me that DWARFCallFrameInfo should push the initial CIE register >>> setup instructions as being the state at offset 0 in the function (in fact >>> I'd say it's a bug if it's not). If that were done, then getting >>> RowAtIndex(0) should be synonymous with get-the-CIE-register-unwind-rules, >>> and this code would be correct. >>> >>> Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do >>> >>> unwind_plan.SetPlanValidAddressRange(range); >>> UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; >>> *cie_initial_row = cie->initial_row; >>> UnwindPlan::RowSP row(cie_initial_row); >>> >>> unwind_plan.SetRegisterKind(GetRegisterKind()); >>> unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); >>> >>> cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. >>> >>> I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing >>> that initial row at offset 0, isn't it? We don't really use DWARF CFI on >>> darwin any more so I don't have a lot of real world experience here. >> >> The only opcodes that push a row are DW_CFA_advance_locXXX and >> DW_CFA_set_loc, so I don't think that is the right fix. I think we need to >> pass a copy of just the registers from the "cie->initial_row" object around >> to the opcode parsing code for restoration purposes. > > > I think everything I'm saying here is besides the point, though. Unless we > ALWAYS push the initial unwind state (from the CIE) to an UnwindPlan, the > DW_CFA_restore is not going to work. If an unwind rule for r12 says > "DW_CFA_restore" and at offset 0 in the function, the unwind rule for r12 was > "same" (i.e. no rule), but we return the RowAtIndex(0) and the first > instruction, I don't know, spills it or something, then the DW_CFA_restore > would set the r12 rule to "r12 was spilled" instead of "r12 is same". > > So the only way DW_CFA_restore would behave correctly, with this, is if we > always push a Row at offset 0 with the rules from the CIE, or with no rules > at all, just the initial unwind state showing how the CFA is set and no > register rules. just to be clear, though, my initial reaction to this bug is "we should always push a row at offset 0." I don't want to sound dumb or anything, but I don't understand how unwinding would work if we didn't have a Row at offset 0. You step into the function, you're at the first instruction, you want to find the caller stack frame, and without knowing the rule for establishing the CFA and finding the saved pc, I don't know how you get that. And the only way to get the CFA / saved pc rule is to get the Row. Do we really not have a Row at offset 0 when an UnwindPlan is created from CFI? I might be forgetting some trick of UnwindPlans, but I don't see how it works. > > >> >> >>> >>> >>> On Oct 8, 2020, at 4:01 PM, Greg Clayton wrote: Hello LLDB devs, This is a deep dive into an issue I found in the LLDB handling of DWARF call frame information, so no need to read further if this doesn't interest you! I am in the process of adding some support to LLVM for parsing the opcode state machines for CIE and FDE objects that produces unwind rows. While making unit tests to test DW_CFA_restore and DW_CFA_restore_extended opcodes, I read the DWARF spec that states: "The DW_CFA_restore instruction takes a single operand (encoded with the opcode) that represents a register number. The required action is to change the rule for the indicated register to the rule assigned it by the initial_instructions in the CIE." Looking at the LLDB code in DWARFCallFrameInfo.cpp I see code that is simplified to: case DW_CFA_restore: if (unwind_plan.IsValidRowIndex(0) && unwind_plan.GetRowAtIndex(0)->GetRegisterInfo(reg_num, reg_location)) row->SetRegisterInfo(reg_num, reg_location); break; The issue is, the CIE contains initial instructions, but it doesn't push a row after doing these instructions, the FDE will push a row when it emits a DW_CFA_advance_loc, DW_CFA_advance_loc1, DW_CFA_advance_loc2, DW_CFA_advance_loc4 or DW_CFA_set_loc opcode. So the DWARF spec says we should restore the register rule to be what it was in the CIE's initial instructions, but we are restoring it to the first row that was parsed. This will mostly not get us into trouble because .debug_frame and .eh_frame usually have a DW_CFA_advance_locXXX or DW_CFA_set_loc opcode as the first opcode, but it isn't a requirement and a FDE could modify a register value prior to pushing the first row at index zero. So we might be restorin
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
> On Oct 8, 2020, at 10:29 PM, Jason Molenda wrote: > > > >> On Oct 8, 2020, at 10:06 PM, Jason Molenda wrote: >> >> >> >>> On Oct 8, 2020, at 9:17 PM, Greg Clayton wrote: >>> >>> >>> On Oct 8, 2020, at 8:55 PM, Jason Molenda wrote: Good bug find! It seems to me that DWARFCallFrameInfo should push the initial CIE register setup instructions as being the state at offset 0 in the function (in fact I'd say it's a bug if it's not). If that were done, then getting RowAtIndex(0) should be synonymous with get-the-CIE-register-unwind-rules, and this code would be correct. Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do unwind_plan.SetPlanValidAddressRange(range); UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; *cie_initial_row = cie->initial_row; UnwindPlan::RowSP row(cie_initial_row); unwind_plan.SetRegisterKind(GetRegisterKind()); unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing that initial row at offset 0, isn't it? We don't really use DWARF CFI on darwin any more so I don't have a lot of real world experience here. >>> >>> The only opcodes that push a row are DW_CFA_advance_locXXX and >>> DW_CFA_set_loc, so I don't think that is the right fix. I think we need to >>> pass a copy of just the registers from the "cie->initial_row" object around >>> to the opcode parsing code for restoration purposes. >> >> >> I think everything I'm saying here is besides the point, though. Unless we >> ALWAYS push the initial unwind state (from the CIE) to an UnwindPlan, the >> DW_CFA_restore is not going to work. If an unwind rule for r12 says >> "DW_CFA_restore" and at offset 0 in the function, the unwind rule for r12 >> was "same" (i.e. no rule), but we return the RowAtIndex(0) and the first >> instruction, I don't know, spills it or something, then the DW_CFA_restore >> would set the r12 rule to "r12 was spilled" instead of "r12 is same". >> >> So the only way DW_CFA_restore would behave correctly, with this, is if we >> always push a Row at offset 0 with the rules from the CIE, or with no rules >> at all, just the initial unwind state showing how the CFA is set and no >> register rules. > > > > just to be clear, though, my initial reaction to this bug is "we should > always push a row at offset 0." I don't want to sound dumb or anything, but > I don't understand how unwinding would work if we didn't have a Row at offset > 0. You step into the function, you're at the first instruction, you want to > find the caller stack frame, and without knowing the rule for establishing > the CFA and finding the saved pc, I don't know how you get that. And the > only way to get the CFA / saved pc rule is to get the Row. Do we really not > have a Row at offset 0 when an UnwindPlan is created from CFI? I might be > forgetting some trick of UnwindPlans, but I don't see how it works. What you are saying makes sense, but that isn't how it is encoded. A quick example: 0010 CIE Version: 4 Augmentation: "" Address size: 4 Segment desc size: 0 Code alignment factor: 1 Data alignment factor: -4 Return address column: 14 DW_CFA_def_cfa: reg13 +0 DW_CFA_nop: DW_CFA_nop: 0014 0024 FDE cie= pc=0001bb2c...0001bc90 DW_CFA_advance_loc: 4 DW_CFA_def_cfa_offset: +32 DW_CFA_offset: reg14 -4 DW_CFA_offset: reg10 -8 DW_CFA_offset: reg9 -12 DW_CFA_offset: reg8 -16 DW_CFA_offset: reg7 -20 DW_CFA_offset: reg6 -24 DW_CFA_offset: reg5 -28 DW_CFA_offset: reg4 -32 DW_CFA_advance_loc: 2 DW_CFA_def_cfa_offset: +112 DW_CFA_nop: DW_CFA_nop: DW_CFA_advance_loc is what pushes a row. As you can see in the FDE, it is the first thing it does. If we were to always push a row after parsing the CIE instructions, then we would end up with two rows at the address for the FDE. The reason we need the FDE is to fill in a valid address for the CIE instructions into the row _before_ pushing it. So the CIE instructions don't make any sense without the FDE making it make sense. So as I originally stated: it probably won't affect us since this almost always happens, but if I have learned anything from years of parsing info compilers emit: if they can do it, they will. So the FDE _could_ modify more register values on top of the CIE instructions before pushing a row... > > >> >> >>> >>> > On Oct 8, 2020, at 4:01 PM, Greg Clayton wrote: > > Hello LLDB devs, > > This is a deep dive into an issue I found in the LLDB handling of DWARF > call frame information, so no need to read further if this doesn't > interest you! > > I am i
Re: [lldb-dev] LLDB might not be handling DW_CFA_restore or DW_CFA_restore_extended correctly in all cases
> On Oct 8, 2020, at 10:37 PM, Greg Clayton wrote: > > > >> On Oct 8, 2020, at 10:29 PM, Jason Molenda wrote: >> >> >> >>> On Oct 8, 2020, at 10:06 PM, Jason Molenda wrote: >>> >>> >>> On Oct 8, 2020, at 9:17 PM, Greg Clayton wrote: > On Oct 8, 2020, at 8:55 PM, Jason Molenda wrote: > > Good bug find! > > It seems to me that DWARFCallFrameInfo should push the initial CIE > register setup instructions as being the state at offset 0 in the > function (in fact I'd say it's a bug if it's not). If that were done, > then getting RowAtIndex(0) should be synonymous with > get-the-CIE-register-unwind-rules, and this code would be correct. > > Looking at DWARFCallFrameInfo::FDEToUnwindPlan, we do > > unwind_plan.SetPlanValidAddressRange(range); > UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; > *cie_initial_row = cie->initial_row; > UnwindPlan::RowSP row(cie_initial_row); > > unwind_plan.SetRegisterKind(GetRegisterKind()); > unwind_plan.SetReturnAddressRegister(cie->return_addr_reg_num); > > cie->initial_row is set by DWARFCallFrameInfo::HandleCommonDwarfOpcode. > > I think the bug here is DWARFCallFrameInfo::FDEToUnwindPlan not pushing > that initial row at offset 0, isn't it? We don't really use DWARF CFI on > darwin any more so I don't have a lot of real world experience here. The only opcodes that push a row are DW_CFA_advance_locXXX and DW_CFA_set_loc, so I don't think that is the right fix. I think we need to pass a copy of just the registers from the "cie->initial_row" object around to the opcode parsing code for restoration purposes. >>> >>> >>> I think everything I'm saying here is besides the point, though. Unless we >>> ALWAYS push the initial unwind state (from the CIE) to an UnwindPlan, the >>> DW_CFA_restore is not going to work. If an unwind rule for r12 says >>> "DW_CFA_restore" and at offset 0 in the function, the unwind rule for r12 >>> was "same" (i.e. no rule), but we return the RowAtIndex(0) and the first >>> instruction, I don't know, spills it or something, then the DW_CFA_restore >>> would set the r12 rule to "r12 was spilled" instead of "r12 is same". >>> >>> So the only way DW_CFA_restore would behave correctly, with this, is if we >>> always push a Row at offset 0 with the rules from the CIE, or with no rules >>> at all, just the initial unwind state showing how the CFA is set and no >>> register rules. >> >> >> >> just to be clear, though, my initial reaction to this bug is "we should >> always push a row at offset 0." I don't want to sound dumb or anything, but >> I don't understand how unwinding would work if we didn't have a Row at >> offset 0. You step into the function, you're at the first instruction, you >> want to find the caller stack frame, and without knowing the rule for >> establishing the CFA and finding the saved pc, I don't know how you get >> that. And the only way to get the CFA / saved pc rule is to get the Row. >> Do we really not have a Row at offset 0 when an UnwindPlan is created from >> CFI? I might be forgetting some trick of UnwindPlans, but I don't see how >> it works. > > What you are saying makes sense, but that isn't how it is encoded. A quick > example: > > 0010 CIE > Version: 4 > Augmentation: "" > Address size: 4 > Segment desc size: 0 > Code alignment factor: 1 > Data alignment factor: -4 > Return address column: 14 > > DW_CFA_def_cfa: reg13 +0 > DW_CFA_nop: > DW_CFA_nop: > > 0014 0024 FDE cie= pc=0001bb2c...0001bc90 > DW_CFA_advance_loc: 4 > DW_CFA_def_cfa_offset: +32 > DW_CFA_offset: reg14 -4 > DW_CFA_offset: reg10 -8 > DW_CFA_offset: reg9 -12 > DW_CFA_offset: reg8 -16 > DW_CFA_offset: reg7 -20 > DW_CFA_offset: reg6 -24 > DW_CFA_offset: reg5 -28 > DW_CFA_offset: reg4 -32 > DW_CFA_advance_loc: 2 > DW_CFA_def_cfa_offset: +112 > DW_CFA_nop: > DW_CFA_nop: > > > DW_CFA_advance_loc is what pushes a row. As you can see in the FDE, it is the > first thing it does. Ah, cool, thanks for the example CFI. I believe DWARFCallFrameInfo::FDEToUnwindPlan is doing the right thing here. We start by initializing the local variable 'row' to the CIE's initial register state: UnwindPlan::Row *cie_initial_row = new UnwindPlan::Row; *cie_initial_row = cie->initial_row; UnwindPlan::RowSP row(cie_initial_row); The first instruction we hit is the advance loc, if (primary_opcode) { switch (primary_opcode) { case DW_CFA_advance_loc: // (Row Creation Instruction) unwind_plan.AppendRow(row); UnwindPlan::Row *newrow = new UnwindPlan::Row; *newrow = *row.get(); row.reset(newrow); row->SlideOffset(extended_opcode * code_align); break; an