On 08/03/2018 11:46 AM, Mark Wielaard wrote:
> Hi Martin,
>
> On Fri, 2018-08-03 at 09:41 +0200, Martin Liška wrote:
>> As briefly discussed with Mark, there are tests that expect 'main'
>> to be present in the backtrace. That's not always true on x86_64
>> because the -freorder-blocks-and-partition option is on by default.
>> Then one can see:
>>
>> [ 88s] FAIL: run-backtrace-dwarf.sh
>> [ 88s]
>> [ 88s]
>> [ 88s] 0x7f1fd49800cb raise
>> [ 88s] 0x7f1fd49694e9 abort
>> [ 88s] 0x5627fddd0188 callme
>> [ 88s] 0x5627fddd0192 doit
>> [ 88s] 0x5627fddd01a3 main.cold.1
>> [ 88s] 0x7f1fd496afeb __libc_start_main
>> [ 88s] 0x5627fddd04aa _start
>> [ 88s] /home/abuild/rpmbuild/BUILD/elfutils-0.173/tests/backtrace-dwarf: dwfl_thread_getframes: no error
>> [ 88s] 0x5627fddd01a3 main.cold.1
>>
>> Thus I'm suggesting that we disable the option for the tests.
>> Thoughts?
>
> So the problem is that some tests look for a 'main' symbol.
> This is imho for C based programs a natural way to see if we can unwind
> to the start of the program (everything before 'main' is infrastructure
> that isn't really relevant to the user). But in some cases the 'main'
> symbol is munged into something else. 'main.cold.1' in this case.
>
> The first question is, does the program also contain a 'main' symbol?
> If so, what does it cover?
> Could you run eu-readelf -s tests/backtrace-dwarf | grep main?

Yes, it does. This can be shown with gcc 8.* on x86_64:
$ cat cold.c
int main(int argc, char **argv)
{
  if (argc != 111)
    __builtin_abort ();
  return 0;
}
$ gcc cold.c -O2
$ readelf -s a.out | grep main
     2: 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
    37: 00400430 5 FUNC LOCAL DEFAULT 14 main.cold.0
    60: 0 FUNC GLOBAL DEFAULT UND __libc_start_main@@GLIBC_
    69: 00400440 20 FUNC GLOBAL DEFAULT 14 main
$ gdb ./a.out
(gdb) r
Starting program: /tmp/a.out
Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50        return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x77a384e9 in __GI_abort () at abort.c:79
#2 0x00400435 in main.cold ()
#3 0x77a39feb in __libc_start_main (main=0x400440 <main>, argc=1,
argv=0x7fffdc88, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fffdc78) at ../csu/libc-start.c:308
#4 0x0040048a in _start () at ../sysdeps/x86_64/start.S:120
If using debug info (-g), then it's fine:
Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50        return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x77a384e9 in __GI_abort () at abort.c:79
#2 0x00400435 in main (argc=<optimized out>, argv=<optimized out>) at cold.c:4
#3 0x77a39feb in __libc_start_main (main=0x400440 <main>, argc=1,
argv=0x7fffdc88, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fffdc78) at ../csu/libc-start.c:308
#4 0x0040048a in _start () at ../sysdeps/x86_64/start.S:120
>
> Now if it does, the question is why didn't we see it?
> Is main.cold.1 an alias? Then we probably should look harder/smarter.
> Or does it not cover any of the backtrace addresses?
Maybe because a jump instruction is used instead of a call?
00400440 <main>:
  400440: 48 83 ec 08           sub    $0x8,%rsp
  400444: 83 ff 6f              cmp    $0x6f,%edi
  400447: 0f 85 e3 ff ff ff     jne    400430 <main.cold.0>
  40044d: 31 c0                 xor    %eax,%eax
  40044f: 48 83 c4 08           add    $0x8,%rsp
  400453: c3                    retq
  400454: 66 2e 0f 1f 84 00 00  nopw   %cs:0x0(%rax,%rax,1)
  40045b: 00 00 00
  40045e: 66 90                 xchg   %ax,%ax
I'm not an expert in libbacktrace, so maybe the truth lies somewhere else.
Martin
>
> If there isn't, or it isn't actually called, then the question is, is
> that actually legal? It seems, at least for C and C++ based programs
> that they should start in 'main'. If they do not, is that because
> gcc did an illegal transformation? Or does it only look that way
> because we cannot unwind correctly (did it do some tail call)?
>
> We could just use -fno-reorder-blocks-and-partition. But I would like to
> first really understand why it is necessary.
>
> If you could maybe post the binary somewhere for inspection that would
> be great.
>
> Thanks,
>
> Mark
>