GDB: etm traces decoding and breakpoints for arm targets

2020-11-02 Thread Zied Guermazi


Hi,

While testing the GDB implementation of branch tracing on Arm
processors using ETM, I ran into a situation where a breakpoint was
set, was hit, and then execution of the program was continued. While
decoding the generated traces, I got the address of the breakpoint
(0x400552) executed twice, and then the following address (0x400554)
also executed twice. The instruction at 0x400554 is a BL (a function
call), and its second execution corrupts the function history.


Here is a dump of the generated trace elements:


-
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400552
end addr   = 0x400554
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
-
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400552
end addr   = 0x400554
instructions count = 1
last_i_type: OCSD_INSTR_OTHER
last_i_subtype: OCSD_S_INSTR_NONE
last instruction was executed
last instruction size: 2
-
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400554
end addr   = 0x400558
instructions count = 1
last_i_type: OCSD_INSTR_BR
last_i_subtype: OCSD_S_INSTR_BR_LINK
last instruction was executed
last instruction size: 4
-
trace_chan_id: 18
isa: CS_ETM_ISA_T32
start addr = 0x400554
end addr   = 0x400558
instructions count = 1
last_i_type: OCSD_INSTR_BR
last_i_subtype: OCSD_S_INSTR_BR_LINK
last instruction was executed
last instruction size: 4

My explanation for this behavior is:

- When the software breakpoint was set, the instruction in memory at
0x400552 was replaced with a BKPT instruction.


- When the breakpoint was hit, the original opcode was restored at 0x400552
and a BKPT was written at the next instruction address (0x400554); then
execution was continued.


- When the second breakpoint (0x400554) was hit, a BKPT opcode was
written at 0x400552 again and the original opcode was restored at
0x400554; then execution was continued.


I am using the function "int target_read_code (CORE_ADDR memaddr,
gdb_byte *myaddr, ssize_t len)" to give the program memory content to the
decoder. So the collected ETM traces are correct, but because memory was
altered in between, the decoder is misled.
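
To make the mismatch concrete: the decoder needs the bytes the program was built with, not the bytes that happen to be in memory while a breakpoint is inserted. The following is a minimal sketch (in Python for brevity) of overlaying saved original bytes onto a memory image before handing it to a decoder. `ShadowEntry` and `unpatch` are hypothetical names for illustration, not GDB APIs, and the opcode bytes are example values only.

```python
from dataclasses import dataclass


@dataclass
class ShadowEntry:
    """Original bytes saved when a breakpoint was inserted (hypothetical type)."""
    addr: int
    original: bytes


def unpatch(base: int, image: bytes, shadows: list[ShadowEntry]) -> bytes:
    """Return a copy of `image` (read at address `base`) with original bytes restored."""
    out = bytearray(image)
    end = base + len(image)
    for s in shadows:
        # Clip each saved range to the window covered by the image.
        lo = max(s.addr, base)
        hi = min(s.addr + len(s.original), end)
        if lo < hi:
            out[lo - base:hi - base] = s.original[lo - s.addr:hi - s.addr]
    return bytes(out)


# Example values only: 0xbe00 is Thumb "bkpt #0" (little-endian bytes 00 be);
# the "original" opcode bytes and the trailing four bytes are made up.
patched = bytes.fromhex("00be") + bytes.fromhex("fff7feff")
clean = unpatch(0x400552, patched, [ShadowEntry(0x400552, bytes.fromhex("1846"))])
```

Whether such an overlay is needed at all depends on what the memory read path already returns, which is the open question here.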


I need to identify the re-execution of code due to breakpoint handling, 
and roll back its impact on etm decoding.
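
One possible heuristic, sketched below under stated assumptions, is to drop a decoded range that exactly repeats the range immediately before it. This is not the OpenCSD API or an existing GDB mechanism: `TraceRange` and `drop_repeated_ranges` are hypothetical names, and the fields mirror only what the dump above prints.

```python
from typing import NamedTuple


class TraceRange(NamedTuple):
    """Subset of the fields printed in the dump above (hypothetical type)."""
    start: int
    end: int
    last_i_type: str


def drop_repeated_ranges(ranges):
    """Drop any range that exactly repeats the one kept just before it."""
    kept = []
    for r in ranges:
        if kept and kept[-1] == r:
            continue  # same addresses and type as the previous range
        kept.append(r)
    return kept


trace = [
    TraceRange(0x400552, 0x400554, "OCSD_INSTR_OTHER"),
    TraceRange(0x400552, 0x400554, "OCSD_INSTR_OTHER"),
    TraceRange(0x400554, 0x400558, "OCSD_INSTR_BR"),
    TraceRange(0x400554, 0x400558, "OCSD_INSTR_BR"),
]
cleaned = drop_repeated_ranges(trace)
```

This would also discard a genuine immediate re-execution (for example, a branch to itself), so correlating the duplicates with the addresses at which the debugger actually stopped is likely to be more robust.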


Is there a means of getting the actual content of program memory, including
patched addresses?


Is there a means of getting the history of patched addresses during the
debugging of a program?


What are the type and subtype of a BKPT instruction in decoded trace
elements?


Do you have any other ideas for handling this situation?


I am attaching the source code of the program as well as the
disassembled binary. The code was compiled as an application running on
Linux on an ARMv7-A (STM32MP157 SoC). The breakpoint was set at line 43
in the source code (line 238 in the disassembled code).



Kind Regards

Zied Guermazi



function_call_history: file format elf32-littlearm


Disassembly of section .init:

00000380 <_init>:
 380:   e92d4008    push    {r3, lr}
 384:   eb000023    bl      418
 388:   e8bd8008    pop     {r3, pc}

Disassembly of section .plt:

0000038c <.plt>:
 38c:   e52de004    push    {lr}            ; (str lr, [sp, #-4]!)
 390:   e59fe004    ldr     lr, [pc, #4]    ; 39c <.plt+0x10>
 394:   e08fe00e    add     lr, pc, lr
 398:   e5bef008    ldr     pc, [lr, #8]!
 39c:   00010c2c    .word   0x00010c2c

000003a0 <__cxa_finalize@plt>:
 3a0:   e28fc600    add     ip, pc, #0, 12
 3a4:   e28cca10    add     ip, ip, #16, 20 ; 0x10000
 3a8:   e5bcfc2c    ldr     pc, [ip, #3116]!        ; 0xc2c

000003ac <__libc_start_main@plt>:
 3ac:   e28fc600    add     ip, pc, #0, 12
 3b0:   e28cca10    add     ip, ip, #16, 20 ; 0x10000
 3b4:   e5bcfc24    ldr     pc, [ip, #3108]!        ; 0xc24

000003b8 <__gmon_start__@plt>:
 3b8:   e28fc600    add     ip, pc, #0, 12
 3bc:   e28cca10    add     ip, ip, #16, 20 ; 0x10000
 3c0:   e5bcfc1c    ldr     pc, [ip, #3100]!        ; 0xc1c

000003c4 :
 3c4:   e28fc600    add     ip, pc, #0, 12
 3c8:   e28cca10    add     ip, ip, #16, 20 ; 0x10000
 3cc:   e5bcfc14    ldr     pc, [ip, #3092]!        ; 0xc14

Disassembly of section .text:

000003d0 <_start>:
 3d0:   f04f 0b00   mov.w   fp, #0
 3d4:   f04f 0e00   mov.w   lr, #0
 3d8:   bc02        pop     {r1}
 3da:   466a        mov     r2, sp
 3dc:   b404        push    {r2}
 3de:   b401        push    {r0}
 3e0:   f8df a024   ldr.w   sl, [pc, #36]   ; 408 <_start+0x38>
 3e4:   a308        add     r3, pc, #32     ; (adr r3, 408 <_start+0x38>)
 3e6:   449a        add     sl, r3
 3e8:   f8df c020   ldr.w   ip, [pc, #32]   ; 40c <_start+0x3c>
 3ec:   f85a c00c   ldr.w   ip, [sl, 

Re: GDB: etm traces decoding and breakpoints for arm targets

2020-11-02 Thread Omair Javaid
Hi Zied

From what I understood from your description, you are looking for a way to
mitigate the effects of the BKPT instruction in the trace data. Your
description of how software breakpoints work is correct: we
write a trap (usually a BKPT instruction or a variant of it) to the
breakpoint address.

The instruction at the breakpoint address can be one of three types: Arm32,
Thumb32, or Thumb16. We need to insert a matching trap instruction, and when
the trap is reported we need to replace its 16 or 32 bits with the original
instruction and perform a single step. Please take a look at the gdbserver
source file linux-aarch32-low.cc (function arm_breakpoint_kind_from_pc) in
binutils-gdb/gdbserver for details on which instructions GDB uses for
setting breakpoints on Arm.
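
As an illustration of the three cases above, here is a minimal sketch of how many bytes a software breakpoint would replace. It relies on the Thumb encoding rule that a first halfword whose top five bits are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction. `thumb_insn_size` and `breakpoint_size` are hypothetical helpers, not GDB functions, and the sketch assumes the trap has the same width as the instruction it replaces.

```python
def thumb_insn_size(first_halfword: int) -> int:
    """Return the size in bytes (2 or 4) of a Thumb instruction."""
    # Top five bits 0b11101, 0b11110 or 0b11111 mark a 32-bit encoding.
    return 4 if (first_halfword >> 11) in (0b11101, 0b11110, 0b11111) else 2


def breakpoint_size(is_thumb: bool, first_halfword: int = 0) -> int:
    """Bytes a software breakpoint replaces (hypothetical helper)."""
    return thumb_insn_size(first_halfword) if is_thumb else 4


# Halfwords taken from the _start disassembly in the original message:
# 0xf04f begins "mov.w fp, #0" (32-bit), 0xbc02 is "pop {r1}" (16-bit).
sizes = [breakpoint_size(True, 0xF04F), breakpoint_size(True, 0xBC02)]
```

With this rule, the 2-byte range at 0x400552 and the 4-byte BL at 0x400554 in the trace dump correspond to the Thumb16 and Thumb32 cases respectively.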

I think you should run a GDB remote debug session with packet logging
turned on to better understand what is going on underneath, and mitigate its
effect on the trace content accordingly.

Use the following command to enable the RSP packet log:
set debug remote 1

Moreover, you should be able to tell Arm and Thumb modes apart correctly:
your stream log suggests the decoder assumes the T32 ISA, while your
function seems to be compiled as Thumb16 code.

I hope this helps.

-- 
Omair Javaid
www.linaro.org


On Mon, 2 Nov 2020 at 15:06, Zied Guermazi  wrote:

> [...]
___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain