https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70178
Bug ID: 70178
Summary: Loop-invariant memory loads from std::string innards are not hoisted
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: zackw at panix dot com
Target Milestone: ---

Consider

#include <string>
#include <algorithm>
#include <iterator>

using std::string;

inline unsigned char hexval(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    else if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    else if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    else
        throw "input character not a hexadecimal digit";
}

void hex2ascii_1(const string& in, string& out)
{
    size_t inlen = in.length();
    if (inlen % 2 != 0)
        throw "input length not a multiple of 2";
    out.clear();
    out.reserve(inlen / 2);
    for (string::const_iterator p = in.begin(); p != in.end(); p++)
    {
        unsigned char c = hexval(*p);
        p++;
        c = (c << 4) + hexval(*p);
        out.push_back(c);
    }
}

void hex2ascii_2(const string& in, string& out)
{
    size_t inlen = in.length();
    if (inlen % 2 != 0)
        throw "input length not a multiple of 2";
    out.clear();
    out.reserve(inlen / 2);
    std::transform(in.begin(), in.end() - 1, in.begin() + 1,
                   std::back_inserter(out),
                   [](unsigned char a, unsigned char b)
                   { return (hexval(a) << 4) + hexval(b); });
}

It seems to me that both of these should be optimizable to the equivalent thing you would write in C, with all the pointers in registers ...

void hex2ascii_hypothetical(const string& in, string& out)
{
    size_t inlen = in.length();
    if (inlen % 2 != 0)
        throw "input length not a multiple of 2";
    out.clear();
    out.reserve(inlen / 2);

    const unsigned char *p = in._M_data();
    const unsigned char *limit = p + in._M_length();
    unsigned char *q = out._M_data();
    // (check for pointer wrap-around here?)

    while (p < limit)
    {
        *q++ = (hexval(p[0]) << 4) + hexval(p[1]);
        p += 2;
    }
}

Maybe it wouldn't be able to deduce that capacity overflow is impossible by construction, but that's a detail.
The important thing is that g++ 5 and 6 cannot hoist the pointer initializations out of the loop as shown. They reload p, limit, and q from memory (that is, from the relevant string objects) on every iteration.
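For comparison, here is a rough sketch of doing the hoisting by hand using only the public std::string interface. The function name hex2ascii_hoisted is hypothetical and not part of the test case above, and the sketch assumes that in and out are distinct objects. It swaps reserve() plus push_back() for a single resize() so that the output pointer can be taken once before the loop. Whether this actually yields the register-only loop depends on the compiler version and flags; I have not verified the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline unsigned char hexval(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    else if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    else if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    else
        throw "input character not a hexadecimal digit";
}

// Hypothetical hand-hoisted variant (name and approach are assumptions,
// not taken from the report). Requires C++11 for &out[0] on an empty string.
void hex2ascii_hoisted(const std::string& in, std::string& out)
{
    // Assumption: in and out must not be the same object.
    assert(&in != &out);

    const std::size_t inlen = in.length();
    if (inlen % 2 != 0)
        throw "input length not a multiple of 2";

    // Size the output once, so no capacity check is needed inside the loop.
    out.resize(inlen / 2);

    // Read the pointers out of the string objects once, into locals.
    const char *p = in.data();
    const char *limit = p + inlen;
    char *q = &out[0];

    while (p < limit)
    {
        // Two hex digits in, one byte out.
        *q++ = static_cast<char>((hexval(p[0]) << 4) + hexval(p[1]));
        p += 2;
    }
}
```

With this variant, calling hex2ascii_hoisted("48656c6c6f", out) leaves "Hello" in out. The trade-off is that resize() value-initializes the output before it is overwritten, which the reserve() and push_back() versions avoid.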