On Mon, May 10, 2010 at 11:09:41AM +1000, vincent.mcint...@csiro.au wrote:
>
> I changed the setup slightly, connecting the storage unit and the host
> to a FC switch. There are still two LUNs and now there are 4 paths to
> each.
>
> # multipath -l
> mpath1 (2227300015530e20d) dm-1 Promise ,VTrak E610f
> [size=13T][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:0:1 sdf 8:80 [active][undef]
>  \_ 1:0:1:1 sdh 8:112 [active][undef]
>  \_ 1:0:2:1 sdj 8:144 [active][undef]
>  \_ 1:0:3:1 sdl 8:176 [active][undef]
> mpath0 (2228f000155e2acda) dm-0 Promise ,VTrak E610f
> [size=13T][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:0:0 sde 8:64 [active][undef]
>  \_ 1:0:1:0 sdg 8:96 [active][undef]
>  \_ 1:0:2:0 sdi 8:128 [active][undef]
>  \_ 1:0:3:0 sdk 8:160 [active][undef]
>
> I could go back to the other configuration briefly if you wish.
>
>
> > Could you run multipathd under valgrind please?
>
> I ran it, then tried to stop with the init script (which didn't seem
> to work) and then tried with 'kill -HUP' and then just 'kill'.
> Somehow the last command caused my shell to attach to the process,
> I then hit ^C.
> Details below.
>
> I tried this twice, with and without the filesystems mounted.
> Results were similar. Between mounting and running the second time
> I briefly exercised the filesystems by copying a bit of data from one to
> the other. I didn't try to stop/start under load.
>
> # /etc/init.d/multipath-tools stop
> # ps -fade|grep multi
> root 12124 12098 0 10:45 pts/2 00:00:00 grep multi
>
> # valgrind multipathd
> ==12134== Memcheck, a memory error detector.
> ==12134== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
> ==12134== Using LibVEX rev 1854, a library for dynamic binary translation.
> ==12134== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
> ==12134== Using valgrind-3.3.1-Debian, a dynamic binary instrumentation framework.
> ==12134== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
> ==12134== For more details, rerun with: -v
> ==12134==
> ==12135==
> ==12135== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
> ==12135== malloc/free: in use at exit: 1,120 bytes in 23 blocks.
> ==12135== malloc/free: 48 allocs, 25 frees, 4,716 bytes allocated.
> ==12135== For counts of detected errors, rerun with: -v
> ==12135== searching for pointers to 23 not-freed blocks.
> ==12135== checked 208,864 bytes.
> ==12135==
> ==12135== LEAK SUMMARY:
> ==12135==    definitely lost: 0 bytes in 0 blocks.
> ==12135==    possibly lost: 0 bytes in 0 blocks.
> ==12135==    still reachable: 1,120 bytes in 23 blocks.
> ==12135==    suppressed: 0 bytes in 0 blocks.
> ==12135== Rerun with --leak-check=full to see details of leaked memory.
> ==12134==
> ==12134== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
> ==12134== malloc/free: in use at exit: 216 bytes in 1 blocks.
> ==12134== malloc/free: 48 allocs, 47 frees, 4,716 bytes allocated.
> ==12134== For counts of detected errors, rerun with: -v
> ==12134== searching for pointers to 1 not-freed blocks.
> ==12134== checked 208,048 bytes.
> ==12134==
> ==12134== LEAK SUMMARY:
> ==12134==    definitely lost: 0 bytes in 0 blocks.
> ==12134==    possibly lost: 0 bytes in 0 blocks.
> ==12134==    still reachable: 216 bytes in 1 blocks.
> ==12134==    suppressed: 0 bytes in 0 blocks.
> ==12134== Rerun with --leak-check=full to see details of leaked memory.
>
> # ps -fade|grep multip
> root 12136 1 5 10:45 ? 00:00:00 /usr/bin/valgrind.bin multipathd
> root 12166 12098 0 10:45 pts/2 00:00:00 grep multip
>
> # /etc/init.d/multipath-tools stop
> Stopping multipath daemon: multipathd.
>
> # ps -fade|grep multip
> root 12136 1 2 10:45 ? 00:00:00 /usr/bin/valgrind.bin multipathd
> root 12172 12098 0 10:45 pts/2 00:00:00 grep multip
>
> # kill -HUP 12136
> # ps -fade|grep multip
> root 12136 1 2 10:45 ? 00:00:00 /usr/bin/valgrind.bin multipathd
> root 12190 12098 0 10:45 pts/2 00:00:00 grep multip
>
> # kill 12136
> ==12136== Thread 9:
> ==12136== Invalid read of size 4
> ==12136==    at 0x4E2E4FE: pthread_mutex_lock (in /lib/libpthread-2.7.so)
> ==12136==    by 0x42B596: (within /sbin/multipathd)
> ==12136==    by 0x42BC83: (within /sbin/multipathd)
> ==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
> ==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
> ==12136==  Address 0x6068788 is 8 bytes inside a block of size 40 free'd
> ==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
> ==12136==    by 0x415770: xfree (in /sbin/multipathd)
> ==12136==    by 0x4069F8: (within /sbin/multipathd)
> ==12136==    by 0x406D98: (within /sbin/multipathd)
> ==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
> ==12136==
> ==12136== Invalid read of size 4
> ==12136==    at 0x4E2E509: pthread_mutex_lock (in /lib/libpthread-2.7.so)
> ==12136==    by 0x42B596: (within /sbin/multipathd)
> ==12136==    by 0x42BC83: (within /sbin/multipathd)
> ==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
> ==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
> ==12136==  Address 0x606878c is 12 bytes inside a block of size 40 free'd
> ==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
> ==12136==    by 0x415770: xfree (in /sbin/multipathd)
> ==12136==    by 0x4069F8: (within /sbin/multipathd)
> ==12136==    by 0x406D98: (within /sbin/multipathd)
> ==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
> ==12136==
> ==12136== Invalid write of size 4
> ==12136==    at 0x4E2E50D: pthread_mutex_lock (in /lib/libpthread-2.7.so)
> ==12136==    by 0x42B596: (within /sbin/multipathd)
> ==12136==    by 0x42BC83: (within /sbin/multipathd)
> ==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
> ==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
> ==12136==  Address 0x6068788 is 8 bytes inside a block of size 40 free'd
> ==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
> ==12136==    by 0x415770: xfree (in /sbin/multipathd)
> ==12136==    by 0x4069F8: (within /sbin/multipathd)
> ==12136==    by 0x406D98: (within /sbin/multipathd)
> ==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
> ==12136==
> ==12136== Invalid read of size 8
> ==12136==    at 0x42B5E8: (within /sbin/multipathd)
> ==12136==    by 0x42BC83: (within /sbin/multipathd)
> ==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
> ==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
> ==12136==  Address 0x6068738 is 0 bytes inside a block of size 24 free'd
> ==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
> ==12136==    by 0x415770: xfree (in /sbin/multipathd)
> ==12136==    by 0x406A0C: (within /sbin/multipathd)
> ==12136==    by 0x406D98: (within /sbin/multipathd)
> ==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
> ==12136==
> ==12136== Invalid read of size 4
> ==12136==    at 0x4E2FBF5: __pthread_mutex_unlock_usercnt (in /lib/libpthread-2.7.so)
> ==12136==    by 0x42B5EF: (within /sbin/multipathd)
> ==12136==    by 0x42BC83: (within /sbin/multipathd)
> ==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
> ==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
> ==12136==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
> ==12136==
> ==12136==
> ==12136== Process terminating with default action of signal 11 (SIGSEGV)
> ==12136==  Access not within mapped region at address 0x10
> ==12136==    at 0x4E2FBF5: __pthread_mutex_unlock_usercnt (in /lib/libpthread-2.7.so)
> ==12136==    by 0x42B5EF: (within /sbin/multipathd)
> ==12136==    by 0x42BC83: (within /sbin/multipathd)
> ==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
> ==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
> ==12136==
> ==12136== ERROR SUMMARY: 6 errors from 5 contexts (suppressed: 9 from 2)
> ==12136== malloc/free: in use at exit: 13,258 bytes in 44 blocks.
> ==12136== malloc/free: 3,990 allocs, 3,946 frees, 3,016,169 bytes allocated.
> ==12136== For counts of detected errors, rerun with: -v
> ==12136== searching for pointers to 44 not-freed blocks.
> ==12136== checked 399,480 bytes.
> ==12136==
> ==12136== LEAK SUMMARY:
> ==12136==    definitely lost: 0 bytes in 0 blocks.
> ==12136==    possibly lost: 1,152 bytes in 4 blocks.
> ==12136==    still reachable: 12,106 bytes in 40 blocks.
> ==12136==    suppressed: 0 bytes in 0 blocks.
> ==12136== Rerun with --leak-check=full to see details of leaked memory.
> ^C

Hmm... no idea what causes this. I can't reproduce it here. Could you
rebuild multipath-tools with DEB_BUILD_OPTIONS=debug and run it in gdb?
Or is there a chance I could log on to the box to check myself?
 -- Guido