[valgrind] [Bug 405295] valgrind 3.14.0 dies due to mysterious DWARF information? (output from rust used by Mozilla TB.)

zephyrus00jp Tue, 09 Apr 2019 01:01:46 -0700

https://bugs.kde.org/show_bug.cgi?id=405295


--- Comment #7 from zephyrus00jp <ishik...@yk.rim.or.jp> ---
This is what I found.

(A side note: under 
4.19.0-1-amd64 #1 SMP Debian 4.19.12-1 (2018-12-22) x86_64 GNU/Linux,
I could run very old 32-bit TB 22.0a1 (2013-03-20)
under valgrind-3.15.0.RC1.)

However, under the same OS, I could not run 2.9.1 (64-bit) (the official
release, not the one I built locally).

The segfault seems to occur in dynamically generated code (or in a
dynamically loaded shared library? I am not sure).

gdb valgrind

(gdb) run --smc-check=all-non-file --fair-sched=yes --redzone-size=128
--vex-iropt-register-updates=allregs-at-mem-access --trace-children=yes
~ishikawa/thunderbird/thunderbird
Starting program: /usr/local/bin/valgrind --smc-check=all-non-file
--fair-sched=yes --redzone-size=128
--vex-iropt-register-updates=allregs-at-mem-access --trace-children=yes
~ishikawa/thunderbird/thunderbird
process 30378 is executing new program:
/usr/local/lib/valgrind/memcheck-amd64-linux
==30378== Memcheck, a memory error detector
==30378== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==30378== Using Valgrind-3.15.0.RC1 and LibVEX; rerun with -h for copyright info
==30378== Command: /home/ishikawa/thunderbird/thunderbird
==30378==

Program received signal SIGSEGV, Segmentation fault.

(gdb) where
#0  0x00000010039fa56c in ?? ()
#1  0x0000001002eadf30 in ?? ()
#2  0x0000001002008380 in ?? ()
#3  0x0000001002eadf18 in ?? ()
#4  0x0000001002eadf30 in ?? ()
#5  0x0000001002eadf40 in ?? ()
#6  0x0000000000000000 in ?? ()

(gdb) info files
Symbols from "/usr/local/lib/valgrind/memcheck-amd64-linux".
Native process:
    Using the running image of child process 30378.
    While running this, GDB does not access memory from...
Local exec file:
    `/usr/local/lib/valgrind/memcheck-amd64-linux', file type elf64-x86-64.
    Entry point: 0x580bac60
    0x0000000058000158 - 0x000000005800017c is .note.gnu.build-id
    0x0000000058000180 - 0x00000000581e9d9a is .text
    0x00000000581e9da0 - 0x000000005825831a is .rodata
    0x0000000058258320 - 0x0000000058286048 is .eh_frame
    0x0000000058287860 - 0x0000000058289f60 is .data.rel.ro.local
    0x0000000058289f60 - 0x0000000058289f90 is .data.rel.ro
    0x0000000058289f90 - 0x0000000058289fe8 is .got
    0x0000000058289fe8 - 0x000000005828a000 is .got.plt
    0x000000005828a000 - 0x000000005828c420 is .data
    0x000000005828c440 - 0x0000000059c8fab9 is .bss

So the PC in the stack trace (0x10039fa56c) clearly lies outside every section
of the valgrind tool executable listed above; its .text ends at 0x581e9d9a.
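
The same check can be done mechanically. Below is a minimal sketch in Python
that tests an address against the section ranges copied from the "info files"
output above. The helper name find_section is made up for illustration, and the
sketch only knows about the sections gdb listed, not about anything mapped
later at run time.

```python
# Section ranges copied from the "info files" output above.
SECTIONS = [
    (0x58000158, 0x5800017c, ".note.gnu.build-id"),
    (0x58000180, 0x581e9d9a, ".text"),
    (0x581e9da0, 0x5825831a, ".rodata"),
    (0x58258320, 0x58286048, ".eh_frame"),
    (0x58287860, 0x58289f60, ".data.rel.ro.local"),
    (0x58289f60, 0x58289f90, ".data.rel.ro"),
    (0x58289f90, 0x58289fe8, ".got"),
    (0x58289fe8, 0x5828a000, ".got.plt"),
    (0x5828a000, 0x5828c420, ".data"),
    (0x5828c440, 0x59c8fab9, ".bss"),
]


def find_section(addr):
    """Return the name of the listed section containing addr, or None."""
    for start, end, name in SECTIONS:
        if start <= addr < end:
            return name
    return None


if __name__ == "__main__":
    # Addresses taken from the gdb session: the faulting PC, the ELF entry
    # point, and two register values that happen to look like tool addresses.
    for label, addr in [
        ("faulting PC", 0x10039fa56c),
        ("entry point", 0x580bac60),
        ("r11", 0x58010f90),
        ("rsi", 0x59298d60),
    ]:
        print(f"{label} {addr:#x}: {find_section(addr)}")
```

With the values from this session, only the faulting PC falls outside all
listed sections.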


(gdb) info reg
rax            0x1002da7bfe        68767349758
rbx            0x0                 0
rcx            0xffffaaaa          4294945450
rdx            0x1002da4000        68767334400
rsi            0x59298d60          1495895392
rdi            0x1ffeffeff8        137422172152
rbp            0x1002008390        0x1002008390
rsp            0x1002eade00        0x1002eade00
r8             0x180d2             98514
r9             0x10058a5710        68812429072
r10            0x4029190           67277200
r11            0x58010f90          1476464528
r12            0x1002eadf40        68768423744
r13            0x1002eadf30        68768423728
r14            0x1002eadf18        68768423704
r15            0x1ffeffeff8        137422172152
rip            0x10039fa56c        0x10039fa56c
eflags         0x10246             [ PF ZF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb)

It seems to me that the crash occurs in code that was dynamically generated
(on the heap?).

It is possible that the heap, or rather the stack frame, was already corrupted
by the time this segmentation fault occurred.

(gdb)  disassemble 0x10039fa56c,0x10039fa580
Dump of assembler code from 0x10039fa56c to 0x10039fa580:
=> 0x00000010039fa56c:  mov    %r10,(%r15)
   0x00000010039fa56f:  movq   $0x40088a4,0xb8(%rbp)
   0x00000010039fa57a:  sub    $0x8,%r15
   0x00000010039fa57e:  movq   $0x0,0x3d0(%rbp)
End of assembler dump.
(gdb) bt

The value of r15 (0x1ffeffeff8, the target of the faulting store
"mov %r10,(%r15)") seems to be near the end of the sbrk'ed region (judging from
earlier runs in which I traced the system calls with strace), so perhaps the
dynamically generated code is accessing an unmapped memory area?
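
One way to test the "unmapped area" guess would be to compare the faulting
address with the process's memory map. The sketch below, in Python, parses text
in the /proc/<pid>/maps format and looks up an address. The sample map text is
HYPOTHETICAL (made up for illustration, not taken from this process), and the
4096-byte page size is an assumption about this x86_64 system.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_base(addr):
    """Round addr down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


def parse_maps(maps_text):
    """Parse /proc/<pid>/maps text into (start, end, perms) tuples."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        regions.append((int(start_s, 16), int(end_s, 16), fields[1]))
    return regions


def lookup(addr, regions):
    """Return the (start, end, perms) region containing addr, or None."""
    for start, end, perms in regions:
        if start <= addr < end:
            return (start, end, perms)
    return None


# HYPOTHETICAL map contents, for illustration only. Real data would come
# from /proc/<pid>/maps of the stopped process.
SAMPLE_MAPS = """\
58000000-581ea000 r-xp 00000000 08:01 123456 /path/to/tool
1ffefff000-1fff001000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    r15 = 0x1ffeffeff8  # from "info reg" in the session above
    regions = parse_maps(SAMPLE_MAPS)
    print(f"page base of r15: {page_base(r15):#x}")
    print(f"offset in page:   {r15 - page_base(r15):#x}")
    print(f"mapped region:    {lookup(r15, regions)}")
```

Against the real map of the stopped process, a result of None for r15 would
support the unmapped-access guess; a region without write permission would
point to a protection fault instead.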

The backtrace looks a bit strange: only the innermost PC seems to point at
valid instructions. (Well, it is possible that we are in a signal handler, and
thus the backtrace may not be quite correct.)

(gdb) bt
#0  0x00000010039fa56c in ?? ()
#1  0x0000001002eadf30 in ?? ()
#2  0x0000001002008380 in ?? ()
#3  0x0000001002eadf18 in ?? ()
#4  0x0000001002eadf30 in ?? ()
#5  0x0000001002eadf40 in ?? ()
#6  0x0000000000000000 in ?? ()
(gdb) disassemble 0x1002eadf30,0x1002eadf40
Dump of assembler code from 0x1002eadf30 to 0x1002eadf40:
   0x0000001002eadf30:    add    %al,(%rax)
   0x0000001002eadf32:    add    %al,(%rax)
   0x0000001002eadf34:    add    %al,(%rax)
   0x0000001002eadf36:    add    %al,(%rax)
   0x0000001002eadf38:    add    %al,(%rax)
   0x0000001002eadf3a:    add    %al,(%rax)
   0x0000001002eadf3c:    add    %al,(%rax)
   0x0000001002eadf3e:    add    %al,(%rax)
End of assembler dump.
(gdb) disassemble 0x1002008380,0x10020083a0
Dump of assembler code from 0x1002008380 to 0x10020083a0:
   0x0000001002008380:    add    %eax,(%rax)
   0x0000001002008382:    add    %al,(%rax)
   0x0000001002008384:    add    (%rax),%al
   0x0000001002008386:    add    %al,(%rax)
   0x0000001002008388:    add    %al,(%rax)
   0x000000100200838a:    add    %al,(%rax)
   0x000000100200838c:    add    %al,(%rax)
   0x000000100200838e:    add    %al,(%rax)
   0x0000001002008390:    sbb    $0x5809a2,%eax
   0x0000001002008395:    add    %al,(%rax)
   0x0000001002008397:    add    %dl,%cl
   0x0000001002008399:    addb   $0x0,(%rcx)
   0x000000100200839c:    add    %al,(%rax)
   0x000000100200839e:    add    %al,(%rax)
End of assembler dump.
(gdb) disassemble 0x1002eadf18,0x1002eadf28
Dump of assembler code from 0x1002eadf18 to 0x1002eadf28:
   0x0000001002eadf18:    rolb   %cl,0x1(%rax)
   0x0000001002eadf1e:    add    %al,(%rax)
   0x0000001002eadf20:    add    %al,(%rax)
   0x0000001002eadf22:    add    %al,(%rax)
   0x0000001002eadf24:    add    %al,(%rax)
   0x0000001002eadf26:    add    %al,(%rax)
End of assembler dump.
(gdb) quit
A debugging session is active.

    Inferior 1 [process 30378] will be killed.

Quit anyway? (y or n) y
mailtest@debian-vbox-ci:~$
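
A further hint that the backtrace above is not a real call chain: several
"frame" addresses are identical to register values from the "info reg" dump,
and the bytes at those addresses disassemble mostly as "add %al,(%rax)", which
is how zero bytes decode on x86-64. The Python sketch below simply
cross-checks the two lists. The helper name matching_registers is made up, and
a match is only a heuristic sign that gdb unwound through data, not proof.

```python
# Frame addresses from the first "bt"/"where" output (frame #6, the
# terminating zero, is left out).
FRAMES = [
    0x10039fa56c,
    0x1002eadf30,
    0x1002008380,
    0x1002eadf18,
    0x1002eadf30,
    0x1002eadf40,
]

# Selected registers from the first "info reg" output.
REGS = {
    "rax": 0x1002da7bfe,
    "rdx": 0x1002da4000,
    "rbp": 0x1002008390,
    "rsp": 0x1002eade00,
    "r12": 0x1002eadf40,
    "r13": 0x1002eadf30,
    "r14": 0x1002eadf18,
    "r15": 0x1ffeffeff8,
    "rip": 0x10039fa56c,
}


def matching_registers(addr, regs):
    """Return the sorted names of registers whose value equals addr."""
    return sorted(name for name, value in regs.items() if value == addr)


if __name__ == "__main__":
    for index, addr in enumerate(FRAMES):
        names = matching_registers(addr, REGS)
        note = ", ".join(names) if names else "no register match"
        print(f"#{index} {addr:#x}: {note}")
```

Frames #1, #3, #4 and #5 coincide with r13, r14, r13 and r12, and frame #2 is
rbp minus 0x10, so these entries look like saved data rather than return
addresses.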

===

I noticed one thing. Sorry, I did not pass the proper valgrind options in the
run below, but that did not seem to change the result.

I found that I can continue past the first two SIGSEGVs. This suggests that
the first couple of SIGSEGVs are probably handled properly by valgrind's signal
handler, which allocates more memory by means of mmap, etc.

However, after the third SIGSEGV, I seem to get stuck at the same position.

This is the interaction from that point on:

mailtest@debian-vbox-ci:~$ gdb valgrind
GNU gdb (Debian 8.2.1-2) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from valgrind...done.
(gdb) run ~ishikawa/thunderbird/thunderbird
Starting program: /usr/local/bin/valgrind ~ishikawa/thunderbird/thunderbird
process 30468 is executing new program:
/usr/local/lib/valgrind/memcheck-amd64-linux
==30468== Memcheck, a memory error detector
==30468== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==30468== Using Valgrind-3.15.0.RC1 and LibVEX; rerun with -h for copyright info
==30468== Command: /home/ishikawa/thunderbird/thunderbird
==30468==

Program received signal SIGSEGV, Segmentation fault.
0x00000010039f8064 in ?? ()
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000001003a6014d in ?? ()
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000001003a6014d in ?? ()
(gdb) where
#0  0x0000001003a6014d in ?? ()
#1  0x0000001002eadf30 in ?? ()
#2  0x0000001002008320 in ?? ()
#3  0x0000001002eadf18 in ?? ()
#4  0x0000001002eadf30 in ?? ()
#5  0x0000001002eadf40 in ?? ()
#6  0x00000000592babc0 in ?? ()
#7  0x0000001002eb1000 in ?? ()
#8  0x000000000001124a in ?? ()
#9  0x0000000000001ef5 in ?? ()
#10 0x0000000000000000 in ?? ()
(gdb) disasm 0x1003a6014d,0x1003a60160
Undefined command: "disasm".  Try "help".
(gdb) disass 0x1003a6014d,0x1003a60160
Dump of assembler code from 0x1003a6014d to 0x1003a60160:
=> 0x0000001003a6014d:    mov    %r10,(%rbx)
   0x0000001003a60150:    movq   $0x40153d3,0xb8(%rbp)
   0x0000001003a6015b:    lea    0x8(%rbx),%r14
   0x0000001003a6015f:    mov    %r14,%rdi
End of assembler dump.
(gdb) info reg
rax            0x1002da73f0        68767347696
rbx            0x1ffeffcfc0        137422163904
rcx            0xffffaaaa          4294945450
rdx            0x1002da4000        68767334400
rsi            0x59298d60          1495895392
rdi            0x1ffeffcfc0        137422163904
rbp            0x1002008330        0x1002008330
rsp            0x1002eade00        0x1002eade00
r8             0x1002da4000        68767334400
r9             0x1a57              6743
r10            0x0                 0
r11            0x58010f90          1476464528
r12            0x1002eadf40        68768423744
r13            0x1002eadf30        68768423728
r14            0x0                 0
r15            0x1ffeffd340        137422164800
rip            0x1003a6014d        0x1003a6014d
eflags         0x10246             [ PF ZF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb)


Any advice for further debugging is appreciated.

TIA for your attention.

PS: OK, maybe I am doing this all wrong and should use the vgdb feature of
Valgrind instead. But even then I obtained a PC address outside the binary in
question... But I could be wrong.
