On Thu, 2 Dec 2021 at 17:21, Jonathan Wakely via Libstdc++
<libstd...@gcc.gnu.org> wrote:
>
> Apart from "don't bother changing the COW string", does anybody see a
> reason we shouldn't do this? This passes all tests for normal COW
> strings and fully-dynamic COW strings.

Pushed to trunk.

>
> When non-const references, pointers or iterators are obtained to the
> contents of a COW std::basic_string, the implementation has to assume it
> could result in a write to the contents. If the string was previously
> shared, it does the "copy-on-write" step of creating a new copy of the
> data that is not shared by another object.  It also marks the string as
> "leaked", so that no future copies of it will share ownership either.
>
> However, if the string is empty then the only character in the sequence
> is the terminating null, and modifying that is undefined behaviour. This
> means that non-const references/pointers/iterators to an empty string
> are effectively const. Since no direct modification is possible, there
> is no need to "leak" the string, it can be safely shared with other
> objects. This avoids unnecessary allocations to create new copies of
> empty strings that can't be modified anyway.
>
> We already did this optimization for strings that share ownership of the
> static _S_empty_rep() object, but not for strings that have non-zero
> capacity, and not for fully-dynamic-strings (where the _S_empty_rep()
> object is never used).
>
> With this change we avoid two allocations in the return statement:
>
>   std::string s;
>   s.reserve(1);       // allocate
>   std::string s2 = s;
>   std::string s3 = s;
>   return s[0] + s2[0] + s3[0]; // leak+allocate twice
>
> libstdc++-v3/ChangeLog:
>
>         * include/bits/cow_string.h (basic_string::_M_leak_hard): Do not
>         reallocate an empty string.
> ---
>  libstdc++-v3/include/bits/cow_string.h | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/cow_string.h 
> b/libstdc++-v3/include/bits/cow_string.h
> index 389b39583e4..b21a7422246 100644
> --- a/libstdc++-v3/include/bits/cow_string.h
> +++ b/libstdc++-v3/include/bits/cow_string.h
> @@ -3366,10 +3366,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>      basic_string<_CharT, _Traits, _Alloc>::
>      _M_leak_hard()
>      {
> -#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
> -      if (_M_rep() == &_S_empty_rep())
> +      // No need to create a new copy of an empty string when a non-const
> +      // reference/pointer/iterator into it is obtained. Modifying the
> +      // trailing null character is undefined, so the ref/pointer/iterator
> +      // is effectively const anyway.
> +      if (this->empty())
>         return;
> -#endif
> +
>        if (_M_rep()->_M_is_shared())
>         _M_mutate(0, 0, 0);
>        _M_rep()->_M_set_leaked();
> --
> 2.31.1
>

Reply via email to