On 08/17/2018 06:10 PM, Mark Wielaard wrote:
Hi Robert,

[I don't have very good internet connectivity so cannot easily get all
  the bits and sources to replicate/inspect. So apologies if I am
  misinterpreting something.]

On Fri, Aug 17, 2018 at 04:25:07PM +0800, Robert Yang wrote:
On 08/17/2018 03:25 AM, Mark Wielaard wrote:
On Thu, Aug 16, 2018 at 10:34:23AM +0800, Robert Yang wrote:
The buffer that actually holds the data is data_list.data.d.d_buf, so check it
before freeing rawdata_base.

This fixes a segmentation fault when prelinking libqb_1.0.3:
prelink: /usr/lib/libqb.so.0.18.2: Symbol section index outside of section numbers

The segmentation fault happens when prelink calls elf_end().

Could you run your reproducer under valgrind and show what it
says before your patch? And/Or post the file (libqb) to replicate
the reproducer somewhere to see exactly what goes wrong?

I don't fully understand what is going wrong. Is the section data
pointing to the file data or something created by elf_newdata?

Thanks for the reply. I found this problem in a cross build environment,
but we are using elfutils as a native tool (running directly on the host).
Its version is 0.172, srcrev=01e87ab4c5a6a249c04e22a97a4221d3.

$ VALGRIND_LIB=/path/to/usr/lib/valgrind valgrind prelink --root
/path/to/core-image-minimal/1.0-r0/rootfs -amR -N -c /etc/prelink.conf
--dynamic-linker /lib/ld-linux-x86-64.so.2 -v

Here are the problems related to elfutils:

prelink: /usr/lib/libqb.so.0.19.0: Symbol section index outside of section numbers
==25330== Invalid free() / delete / delete[] / realloc()
==25330==    at 0x4C3026B: free (in /path/to/recipe-sysroot-native/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25330==    by 0x50991F5: elf_end (elf_end.c:171)
==25330==    by 0x41E916: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x422233: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x408BE1: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x409015: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x4038FB: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x52CF04A: (below main) (in /buildarea1/lyang1/test_up/tmp/sysroots-uninative/x86_64-linux/lib/libc-2.27.so)
==25330==  Address 0x9305300 is 0 bytes inside a block of size 20 free'd
==25330==    at 0x4C3026B: free (in /path/to/recipe-sysroot-native/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25330==    by 0x41E8B7: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x422233: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x408BE1: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x409015: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x4038FB: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x52CF04A: (below main) (in /buildarea1/lyang1/test_up/tmp/sysroots-uninative/x86_64-linux/lib/libc-2.27.so)
==25330==  Block was alloc'd at
==25330==    at 0x4C2F03F: malloc (in /path/to/recipe-sysroot-native/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25330==    by 0x509DE8E: __libelf_set_rawdata_wrlock (elf_getdata.c:329)
==25330==    by 0x509E20E: __elf_getdata_rdlock (elf_getdata.c:532)
==25330==    by 0x509E24D: elf_getdata (elf_getdata.c:559)
==25330==    by 0x420A70: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x413860: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x408D64: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x409015: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x4038FB: ??? (in /path/to/recipe-sysroot-native/usr/sbin/prelink)
==25330==    by 0x52CF04A: (below main) (in /buildarea1/lyang1/test_up/tmp/sysroots-uninative/x86_64-linux/lib/libc-2.27.so)

Thanks, that does suggest to me there is a bug in prelink.
Although it isn't completely clear. If you could run the same with
prelink debuginfo so we can see the source lines that would be great.

The reason I think this is a prelink issue is that it looks like
it is calling elf_getdata () to get the data, and then frees the buffer.
The idea is that you only own the data of an elf section if you created
it yourself with elf_newdata (). Otherwise, as seems to have happened
here, libelf owns the data buffer and has to free it.

I think that prelink is preparing the ELF file, but then half way
through encounters an error. It then frees all the ELF section data it
believed it created itself. But because of the error it probably didn't
(yet) do that. And so frees some data that it got directly from
elf_getdata () and didn't create itself. Then it calls elf_end () and
libelf also thinks it owns that data and frees it again.
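
To make the ownership rule concrete, here is a minimal sketch in C. It is
a toy model, not libelf or prelink code: struct sec_data and the functions
library_get, app_replace, app_cleanup and library_end are hypothetical
names standing in for elf_getdata (), the application swapping in its own
buffer, the application's cleanup path and elf_end (). The point is only
that cleanup must skip any buffer the library handed out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a section's data; not a libelf type. */
struct sec_data {
    void *lib_buf;  /* allocated by the "library", freed by library_end() */
    void *buf;      /* what the application sees; initially == lib_buf */
};

/* Models elf_getdata(): the library allocates and keeps ownership. */
int library_get(struct sec_data *d, size_t size)
{
    d->lib_buf = malloc(size);
    d->buf = d->lib_buf;
    return d->lib_buf != NULL ? 0 : -1;
}

/* Models the application installing a buffer it allocated itself. */
int app_replace(struct sec_data *d, size_t size)
{
    void *p = calloc(1, size);
    if (p == NULL)
        return -1;
    if (d->buf != d->lib_buf)
        free(d->buf);           /* drop an earlier application buffer */
    d->buf = p;
    return 0;
}

/* Application cleanup: free only what the application allocated.
   Returns 1 if a buffer was freed, 0 if it belonged to the library. */
int app_cleanup(struct sec_data *d)
{
    if (d->buf == d->lib_buf)
        return 0;               /* library-owned: freeing here would lead
                                   to a double free in library_end() */
    free(d->buf);
    d->buf = d->lib_buf;
    return 1;
}

/* Models elf_end(): the library releases its own buffer exactly once. */
void library_end(struct sec_data *d)
{
    assert(d->buf == d->lib_buf);  /* app_cleanup() must have run first */
    free(d->lib_buf);
    d->lib_buf = NULL;
    d->buf = NULL;
}
```

In this model, an error path that runs app_cleanup () before app_replace ()
was ever reached frees nothing, so the later library_end () remains the
only free of the library's buffer.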

It probably only happens when prelink encounters some other issue while
processing a file.

Yes, I think so. I also asked the libqb community for help:

https://github.com/ClusterLabs/libqb/issues/314

Nothing goes wrong if I use the topic-no-ldsection branch.

I had used gdb to debug prelink and elfutils before, but didn't find the
root cause. Here is the valgrind result with prelink's debuginfo:

Prelinking /usr/lib/libqb.so.0.19.0
prelink: /usr/lib/libqb.so.0.19.0: Symbol section index outside of section numbers
==124227== Invalid free() / delete / delete[] / realloc()
==124227==    at 0x4C3026B: free (in /path/to/1.0-r0/recipe-sysroot-native/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==124227==    by 0x50991F5: elf_end (elf_end.c:171)
==124227==    by 0x41E65B: close_dso_1 (dso.c:1680)
==124227==    by 0x421966: close_dso (dso.c:1702)
==124227==    by 0x408526: prelink_ent (doit.c:236)
==124227==    by 0x4085D8: prelink_all (doit.c:255)
==124227==    by 0x412AB0: main (main.c:565)
==124227==  Address 0x9305300 is 0 bytes inside a block of size 20 free'd
==124227==    at 0x4C3026B: free (in /path/to/1.0-r0/recipe-sysroot-native/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==124227==    by 0x41E600: close_dso_1 (dso.c:1669)
==124227==    by 0x421966: close_dso (dso.c:1702)
==124227==    by 0x408526: prelink_ent (doit.c:236)
==124227==    by 0x4085D8: prelink_all (doit.c:255)
==124227==    by 0x412AB0: main (main.c:565)
==124227==  Block was alloc'd at
==124227==    at 0x4C2F03F: malloc (in /path/to/1.0-r0/recipe-sysroot-native/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==124227==    by 0x509DE8E: __libelf_set_rawdata_wrlock (elf_getdata.c:329)
==124227==    by 0x509E20E: __elf_getdata_rdlock (elf_getdata.c:532)
==124227==    by 0x509E24D: elf_getdata (elf_getdata.c:559)
==124227==    by 0x42031B: reopen_dso (dso.c:881)
==124227==    by 0x413532: prelink_prepare (prelink.c:345)
==124227==    by 0x4081FB: prelink_ent (doit.c:128)
==124227==    by 0x4085D8: prelink_all (doit.c:255)
==124227==    by 0x412AB0: main (main.c:565)
==124227==
Prelinking /sbin/udevd
prelink: /sbin/udevd: section file offsets not monotonically increasing
Prelinking /bin/busybox.nosuid
prelink: /bin/busybox.nosuid: section file offsets not monotonically increasing
prelink: Could not prelink /usr/sbin/qb-blackbox because its dependency /usr/lib/libqb.so.0 could not be prelinked
Prelinking /bin/mountpoint.sysvinit
Prelinking /usr/bin/utmpdump.sysvinit
prelink: /usr/bin/utmpdump.sysvinit: section file offsets not monotonically increasing
Prelinking /bin/busybox.suid
Prelinking /lib/udev/v4l_id
Prelinking /sbin/bootlogd
Prelinking /usr/bin/wall.sysvinit
prelink: /usr/bin/wall.sysvinit: section file offsets not monotonically increasing
Prelinking /usr/bin/mesg.sysvinit
Prelinking /sbin/halt.sysvinit
==124227==
==124227== HEAP SUMMARY:
==124227==     in use at exit: 228,829 bytes in 696 blocks
==124227==   total heap usage: 5,145 allocs, 4,471 frees, 8,278,606 bytes allocated
==124227==
==124227== LEAK SUMMARY:
==124227==    definitely lost: 142,624 bytes in 420 blocks
==124227==    indirectly lost: 0 bytes in 0 blocks
==124227==      possibly lost: 0 bytes in 0 blocks
==124227==    still reachable: 86,205 bytes in 276 blocks
==124227==         suppressed: 0 bytes in 0 blocks
==124227== Rerun with --leak-check=full to see details of leaked memory
==124227==
==124227== For counts of detected and suppressed errors, rerun with: -v
==124227== Use --track-origins=yes to see where uninitialised values come from
==124227== ERROR SUMMARY: 28 errors from 4 contexts (suppressed: 0 from 0)

// Robert


Cheers,

Mark