On Thu, 2 Dec 2021 at 17:21, Jonathan Wakely via Libstdc++ <libstd...@gcc.gnu.org> wrote: > > Apart from "don't bother changing the COW string", does anybody see a > reason we shouldn't do this? This passes all tests for normal COW > strings and fully-dynamic COW strings.
Pushed to trunk. > > When non-const references, pointers or iterators are obtained to the > contents of a COW std::basic_string, the implementation has to assume it > could result in a write to the contents. If the string was previously > shared, it does the "copy-on-write" step of creating a new copy of the > data that is not shared by another object. It also marks the string as > "leaked", so that no future copies of it will share ownership either. > > However, if the string is empty then the only character in the sequence > is the terminating null, and modifying that is undefined behaviour. This > means that non-const references/pointers/iterators to an empty string > are effectively const. Since no direct modification is possible, there > is no need to "leak" the string, it can be safely shared with other > objects. This avoids unnecessary allocations to create new copies of > empty strings that can't be modified anyway. > > We already did this optimization for strings that share ownership of the > static _S_empty_rep() object, but not for strings that have non-zero > capacity, and not for fully-dynamic-strings (where the _S_empty_rep() > object is never used). > > With this change we avoid two allocations in the return statement: > > std::string s; > s.reserve(1); // allocate > std::string s2 = s; > std::string s3 = s; > return s[0] + s2[0] + s3[0]; // leak+allocate twice > > libstdc++-v3/ChangeLog: > > * include/bits/cow_string.h (basic_string::_M_leak_hard): Do not > reallocate an empty string. > --- > libstdc++-v3/include/bits/cow_string.h | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/libstdc++-v3/include/bits/cow_string.h > b/libstdc++-v3/include/bits/cow_string.h > index 389b39583e4..b21a7422246 100644 > --- a/libstdc++-v3/include/bits/cow_string.h > +++ b/libstdc++-v3/include/bits/cow_string.h > @@ -3366,10 +3366,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > basic_string<_CharT, _Traits, _Alloc>:: > _M_leak_hard() > { > -#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 > - if (_M_rep() == &_S_empty_rep()) > + // No need to create a new copy of an empty string when a non-const > + // reference/pointer/iterator into it is obtained. Modifying the > + // trailing null character is undefined, so the ref/pointer/iterator > + // is effectively const anyway. > + if (this->empty()) > return; > -#endif > + > if (_M_rep()->_M_is_shared()) > _M_mutate(0, 0, 0); > _M_rep()->_M_set_leaked(); > -- > 2.31.1 >