Chet Ramey wrote on Mon, Oct 04, 2021 at 10:11:10PM -0400: > > I'm running busybox sh in a unit (which starts properly), then > > interactively test things from there. > > > > Running in gdb does fail the same way as running normally, so I've also > > been looking at that a bit, but nothing obvious poped up. > > I'd like to trace back which allocation corresponds to the failing one, > > and break from there next time. > > Without the allocation tracing, which you don't get from allocations that > aren't directly from bash, it's tedious, but doable. It only works if the > memory layout is consistent enough that you get the same address failing > every time (that is, the address being passed to the failing free() is the > same). > > If you do, you write it down, put a breakpoint in internal_malloc at the > point where it returns the memory to its caller, and just run, breaking > at that point, until malloc returns that address. Then run a backtrace and > see where you are.
Turns out I'm lucky enough on address consistency.. So, - since we have a nice before/after with systemd, I took a moment to bisect it. It comes down to this commit[1] which is basically using malloc_usable_size() to use buffers beyond what it initially requested [1] https://github.com/systemd/systemd/commit/319a4f4bc46b230fc660321e99aaac1bc449deea - through gdb/printf I see that the failing pointer has been allocated with (what comes down to) malloc(64), but malloc_usable_size returns 108, so systemd happily writes beyond the 64 bytes it originally requested -- which bash allocator treats as an overflow. copying bash's malloc_usable_size here for convenience: ---- malloc_usable_size (mem) void *mem; { register union mhead *p; register char *ap; register int maxbytes; if ((ap = (char *)mem) == 0) return 0; /* Find the true start of the memory block to discover which bin */ p = (union mhead *) ap - 1; if (p->mh_alloc == ISMEMALIGN) { ap -= p->mh_nbytes; p = (union mhead *) ap - 1; } /* XXX - should we return 0 if ISFREE? */ maxbytes = binsize(p->mh_index); /* So the usable size is the maximum number of bytes in the bin less the malloc overhead */ maxbytes -= MOVERHEAD + MSLOP; return (maxbytes); } ---- If I change malloc_usable_size to return p->mh_nbytes instead of maxbytes, then the crash disappears.[2] I did not read the full bash malloc code but I suspect the buffer really could be grown, but we would need to fix p->mh_nbytes to maxbytes and also adjust the end block to pass sanity checks on free -- e.g. it should be considered as a lightweight inplace realloc. I'm not sure we care enough to be honest and returning what is really usable feels like the simplest solution, what do you think? [2] return p->mh_nbytes ------ diff --git a/lib/malloc/malloc.c b/lib/malloc/malloc.c index 439f8ef11af2..ad27d793b7f4 100644 --- a/lib/malloc/malloc.c +++ b/lib/malloc/malloc.c @@ -1272,8 +1272,6 @@ malloc_usable_size (mem) { register union mhead *p; register char *ap; - register int maxbytes; - if ((ap = (char *)mem) == 0) return 0; @@ -1286,13 +1284,7 @@ malloc_usable_size (mem) p = (union mhead *) ap - 1; } - /* XXX - should we return 0 if ISFREE? */ - maxbytes = binsize(p->mh_index); - - /* So the usable size is the maximum number of bytes in the bin less the - malloc overhead */ - maxbytes -= MOVERHEAD + MSLOP; - return (maxbytes); + return p->mh_nbytes; } #if !defined (NO_VALLOC) ------ Thanks, -- Dominique