Hello, I tried something different and the stack trace now looks more reasonable.
Previously, I was running /lib/ld.so in gdb, using
'set exec-wrapper env LD_DEBUG=bindings' to set LD_DEBUG in the inferior and
finally 'run /bin/ls' to start ls. I'm now running gdb directly on ls (while
still using exec-wrapper to set LD_DEBUG) and the stack traces are quite
different.

For 'bindings':

(gdb) bt
#0  __io_write (io_object=135,
    data=0x2d770 "         0:\tbinding file /lib/i386-gnu/libc.so.0.3 [0] to /lib/i386-gnu/libc.so.0.3 [0]: normal symbol `__mach_msg'",
    dataCnt=115, offset=-1, amount=0x2d80c)
    at /opt/stash/diego/src/debian/glibc/glibc-2.22/build-tree/hurd-i386-libc/hurd/RPC_io_write.c:140
#1  0x0001780a in __writev (fd=<optimized out>, iov=<optimized out>,
    niov=<optimized out>) at ../sysdeps/mach/hurd/dl-sysdep.c:429
#2  0x00010643 in _dl_writev (niov=14, iov=0x2d8c0, fd=2)
    at ./dl-writev.h:54
#3  _dl_debug_vdprintf (fd=2, tag_p=-1, tag_p@entry=1, fmt=0x2200d "",
    fmt@entry=0x21fdc "binding file %s [%lu] to %s [%lu]: %s symbol `%s'",
    arg=0x2db1c "") at dl-misc.c:244
#4  0x00010a59 in _dl_debug_printf (
    fmt=0x21fdc "binding file %s [%lu] to %s [%lu]: %s symbol `%s'")
    at dl-misc.c:255
#5  0x0000a72b in _dl_debug_bindings (protected=0, type_class=1,
    version=0x1239150, value=0x2db60, ref=0x2dbec, undef_map=0x10327f0,
    undef_name=0x10483b3 "__mach_msg") at dl-lookup.c:1005
#6  _dl_lookup_symbol_x (undef_name=undef_name@entry=0x10483b3 "__mach_msg",
    undef_map=undef_map@entry=0x10327f0, ref=ref@entry=0x2dbec,
    symbol_scope=0x10329a8, version=0x1239150, type_class=1, flags=5,
    skip_map=0x0) at dl-lookup.c:940
#7  0x0000f008 in _dl_fixup (l=0x10327f0, reloc_arg=<optimized out>)
    at dl-runtime.c:111
#8  0x00015c80 in _dl_runtime_resolve () at ../sysdeps/i386/dl-trampoline.S:43
#9  0x01073ae3 in ?? ()
(gdb) frame
#0  __io_write (io_object=145,
    data=0x2d770 "         0:\tbinding file /lib/i386-gnu/libc.so.0.3 [0] to /lib/i386-gnu/libc.so.0.3 [0]: normal symbol `__mach_msg'",
    dataCnt=115, offset=-1, amount=0x2d80c)
    at /opt/stash/diego/src/debian/glibc/glibc-2.22/build-tree/hurd-i386-libc/hurd/RPC_io_write.c:140
140       InP->dataType = dataType;
(gdb) disassemble $eip,+8
Dump of assembler code from 0x1affa to 0x1b002:
=> 0x0001affa <__io_write+26>:  movzwl 0x2a(%esp),%eax
   0x0001afff <__io_write+31>:  mov    0x868(%esp),%esi
End of assembler dump.

For 'symbols':

(gdb) bt
#0  __io_write (io_object=141,
    data=0x2d068 "         0:\tsymbol=__mach_msg;  lookup in file=/bin/ls [0]\n",
    dataCnt=59, offset=-1, amount=0x2d0c4)
    at /opt/stash/diego/src/debian/glibc/glibc-2.22/build-tree/hurd-i386-libc/hurd/RPC_io_write.c:140
#1  0x0001780a in __writev (fd=<optimized out>, iov=<optimized out>,
    niov=<optimized out>) at ../sysdeps/mach/hurd/dl-sysdep.c:429
#2  0x00010643 in _dl_writev (niov=8, iov=0x2d158, fd=2)
    at ./dl-writev.h:54
#3  _dl_debug_vdprintf (fd=2, tag_p=tag_p@entry=1, fmt=0x21f0c "",
    fmt@entry=0x21ee8 "symbol=%s;  lookup in file=%s [%lu]\n",
    arg=0x2d3a8 "") at dl-misc.c:244
#4  0x00010a59 in _dl_debug_printf (
    fmt=0x21ee8 "symbol=%s;  lookup in file=%s [%lu]\n") at dl-misc.c:255
#5  0x000097ca in do_lookup_x (
    undef_name=undef_name@entry=0x10483b3 "__mach_msg",
    new_hash=new_hash@entry=2703160418, old_hash=old_hash@entry=0x2d490,
    ref=0x103f23c, result=0x2d498, scope=0x2bc14, i=0, version=0x1239150,
    flags=5, skip=0x0, type_class=1, undef_map=0x10327f0)
    at dl-lookup.c:382
#6  0x0000a32d in _dl_lookup_symbol_x (
    undef_name=undef_name@entry=0x10483b3 "__mach_msg",
    undef_map=undef_map@entry=0x10327f0, ref=ref@entry=0x2d524,
    symbol_scope=0x10329a8, version=0x1239150, type_class=1, flags=5,
    skip_map=0x0) at dl-lookup.c:829
#7  0x0000f008 in _dl_fixup (l=0x10327f0, reloc_arg=<optimized out>)
    at dl-runtime.c:111
#8  0x00015c80 in _dl_runtime_resolve () at ../sysdeps/i386/dl-trampoline.S:43
#9  0x01073ae3 in ?? ()
(gdb) frame
#0  __io_write (io_object=152,
    data=0x2d068 "         0:\tsymbol=__mach_msg;  lookup in file=/bin/ls [0]\n",
    dataCnt=59, offset=-1, amount=0x2d0c4)
    at /opt/stash/diego/src/debian/glibc/glibc-2.22/build-tree/hurd-i386-libc/hurd/RPC_io_write.c:140
140       InP->dataType = dataType;
(gdb) disassemble $eip,+8
Dump of assembler code from 0x1affa to 0x1b002:
=> 0x0001affa <__io_write+26>:  movzwl 0x2a(%esp),%eax
   0x0001afff <__io_write+31>:  mov    0x868(%esp),%esi
End of assembler dump.

And using 'statistics':

(gdb) bt
#0  0x00017802 in __writev (fd=<optimized out>, iov=<optimized out>,
    niov=<optimized out>) at ../sysdeps/mach/hurd/dl-sysdep.c:429
#1  0x00010643 in _dl_writev (niov=12, iov=0x102ca90, fd=2)
    at ./dl-writev.h:54
#2  _dl_debug_vdprintf (fd=2, tag_p=tag_p@entry=1, fmt=0x22464 "",
    fmt@entry=0x223f0 "\nruntime linker statistics:\n", ' ' <repeats 11 times>, "final number of relocations: %lu\nfinal number of relocations from cache: %lu\n",
    arg=0x102ccdc "\017\001\001") at dl-misc.c:244
#3  0x00010a59 in _dl_debug_printf (
    fmt=0x223f0 "\nruntime linker statistics:\n", ' ' <repeats 11 times>, "final number of relocations: %lu\nfinal number of relocations from cache: %lu\n")
    at dl-misc.c:255
#4  0x0001040f in _dl_fini () at dl-fini.c:290
#5  0x01097597 in __run_exit_handlers (status=0,
    listp=0x121a3fc <__exit_funcs>, run_list_atexit=true) at exit.c:82
#6  0x010975df in exit (status=0) at exit.c:104
#7  0x01080b96 in __libc_start_main (main=0x8049cd0, argc=1,
    argv=0x102ce34, init=0x805b190, fini=0x805b1f0,
    rtld_fini=0xff50 <_dl_fini>, stack_end=0x102ce2c) at libc-start.c:325
#8  0x0804b5dd in ?? ()
(gdb) frame
#0  0x00017802 in __writev (fd=<optimized out>, iov=<optimized out>,
    niov=<optimized out>) at ../sysdeps/mach/hurd/dl-sysdep.c:429
429       err = __io_write (_hurd_init_dtable[fd], buf, total, -1, &nwrote);
(gdb) disassemble $eip,+8
Dump of assembler code from 0x17802 to 0x1780a:
=> 0x00017802 <__writev+194>:   pushl  (%eax,%edi,4)
   0x00017805 <__writev+197>:   call   0x1afe0 <__io_write>
End of assembler dump.

Now the stack traces look consistent with the observation that, after
commenting out the print operations, the segfaults are no longer triggered.

On Tue, Mar 15, 2016 at 10:30:07PM +0100, Samuel Thibault wrote:
>
> In both cases it fails while trying to write to the stack.  Does perhaps
> backtrace show a long recursion?  Or perhaps the stack here is too small
> for what dl.so wants to do?
>

Yes, it may be the latter. Examining the stack location being accessed gives
an error in gdb in both cases.

bindings:

(gdb) x/4c $esp + 0x2a
0x2cf1a:        Cannot access memory at address 0x2cf1a

symbols:

(gdb) x/4c $esp + 0x2a
0x2c812:        Cannot access memory at address 0x2c812

I'll check the stack usage of the functions involved in the stack trace. I
think I'd also need to check the memory layout of the process to confirm it
(I don't know how though :).

The 'statistics' case appears to be something different, as if
_hurd_init_dtable points to bad memory. In fact:

(gdb) x/4c $eax + $edi * 4
0x102f008:      Cannot access memory at address 0x102f008

so I think it's not the pushl itself that's failing, but the indirect memory
read that fetches its operand.

Thanks,
Diego
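
The address arithmetic behind the 'bindings' and 'symbols' faults can be worked through by hand. What follows is only a sketch in Python, not part of the original gdb sessions: the helper name `analyse` is made up for illustration, and the 4 KiB page size is an assumption (the usual value on i386). All addresses and offsets are copied from the gdb output above.

```python
# Offsets taken from the disassembly of __io_write shown above.
FAULT_OFFSET = 0x2A      # movzwl 0x2a(%esp),%eax  (the faulting access)
ARG_OFFSET = 0x868       # mov    0x868(%esp),%esi (next instruction)
PAGE_SIZE = 0x1000       # assumption: 4 KiB pages


def analyse(fault_addr, data_ptr):
    """Recover %esp from the faulting address and relate it to `data`."""
    esp = fault_addr - FAULT_OFFSET
    return {
        "esp": esp,
        "arg_slot": esp + ARG_OFFSET,          # slot read by the next mov
        "gap_to_data": data_ptr - esp,         # bytes between %esp and data
        "fault_page": fault_addr // PAGE_SIZE * PAGE_SIZE,
        "data_page": data_ptr // PAGE_SIZE * PAGE_SIZE,
    }


# fault address from `x/4c $esp + 0x2a`, data pointer from frame #0
cases = {
    "bindings": analyse(0x2CF1A, 0x2D770),
    "symbols": analyse(0x2C812, 0x2D068),
}

for name, info in cases.items():
    print(name, {key: hex(value) for key, value in info.items()})
```

In both cases %esp ends up 0x880 bytes below the `data` buffer and inside the page at 0x2c000, while the buffer itself, which gdb was able to print, sits in the page at 0x2d000. That would fit a stack whose lowest mapped page is 0x2d000, but only the real memory map of the process can confirm it.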