https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942
--- Comment #12 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #5)
> The main gotcha here is m_b_r does not allocate on construction, but rather
> allocates 2x of the preallocation size on first call to 'allocate', and then
> deallocates when 'release' is called. So it repeatedly calls malloc/free in
> the inner benchmark loop, whereas your custom allocator allocates on
> construction and deallocates on destruction, avoiding repeated malloc/free
> calls in the loop and associated lock contention when multithreaded.

m_b_r really needs a "rewind()" member to mark all allocated memory as unused
again, without actually deallocating it and returning it to the upstream
resource.
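
To make the behaviour described in comment #5 observable, here is a minimal sketch that counts how often std::pmr::monotonic_buffer_resource goes to its upstream resource when release() is called inside a loop. The names counting_resource, upstream_calls and release_in_loop are hypothetical helpers made up for this illustration, and the loop shape (one small allocation followed by release()) is only an assumption about what the benchmark does.

```cpp
#include <cstddef>
#include <memory_resource>

// Hypothetical upstream resource: counts calls and forwards to new/delete.
class counting_resource : public std::pmr::memory_resource {
public:
    std::size_t allocs = 0;
    std::size_t deallocs = 0;

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++allocs;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        ++deallocs;
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
};

struct upstream_calls {
    std::size_t allocs;
    std::size_t deallocs;
};

// Assumed shape of the benchmark's inner loop: allocate a little, then release().
upstream_calls release_in_loop(int iterations) {
    counting_resource upstream;
    {
        // No caller-supplied buffer: only an initial size hint and the upstream.
        std::pmr::monotonic_buffer_resource mbr(1024, &upstream);
        for (int i = 0; i < iterations; ++i) {
            (void)mbr.allocate(64);  // first allocation after release() hits upstream
            mbr.release();           // returns every chunk to upstream
        }
    }  // destructor releases anything still held
    return {upstream.allocs, upstream.deallocs};
}
```

Calling release_in_loop(n) should report at least n upstream allocations and a matching number of deallocations, which is the repeated malloc/free traffic mentioned above. A "rewind()" member as suggested would presumably keep those counts near one regardless of n.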