https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90436
--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
(writing down some notes)
Calling
    size_type
    _M_check_len_one(const char* __s) const
    {
      if (max_size() - size() < 1)
        __throw_length_error(__N(__s));
      const size_type __len = size() + (std::max)(size(), (size_t)1);
      return (__len > max_size()) ? max_size() : __len;
    }
instead of _M_check_len reduces the running time of this micro-benchmark
#include <vector>
int main(){
  volatile int a=0;
  for(int i=0;i<1000000;++i){
    std::vector<int> v;
    for(int j=0;j<1000;++j){
      v.push_back(j);
    }
    a=v[a];
  }
}
from 0.88s to 0.66s at -O3. Two key elements (the perf gain only appears if we do both) are removing the overflow check, and having the comparison between size and max_size optimized so that it is done on byte lengths (not divided by the element size).
I think the overflow check could be removed from the normal _M_check_len: we have already checked that max_size() - size() >= __n, so size() + __n cannot overflow. And size() must be smaller than max_size(), which should be at most SIZE_MAX/2 (at least if ptrdiff_t and size_t have the same size), so size() + size() cannot overflow either.
I should check whether the compiler could help more. It is supposed to know how to optimize .ADD_OVERFLOW based on the ranges of the operands.
I suspect that a single_use restriction explains why max_size() == size() is compared without the division while max_size() - size() < __n (for __n = 1) isn't.