https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119037

            Bug ID: 119037
           Summary: Incorrect calculations of max_size involving
                    basic_strings
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luigighiron at gmail dot com
  Target Milestone: ---

The following example demonstrates situations where the max_size of a
basic_string is calculated incorrectly in libstdc++ (compiled with C++20 or
newer):

#include<memory>
#include<string>
#include<iostream>
template<typename T>struct A:std::allocator<T>{
    A::allocator::size_type max_size()const noexcept{
        return 1;
    }
};
template<typename T>struct B:std::allocator<T>{
    B::allocator::size_type max_size()const noexcept{
        return 0;
    }
};
int main(){
    //issue one
    {
        std::basic_string<char,std::char_traits<char>,A<char>>s;
        s.push_back('a');
        s.push_back('b');
        std::cout<<s.size()<<' '<<s.capacity()<<' '<<s.max_size()<<'\n';
        std::cout<<std::boolalpha<<(s.size()>s.max_size())<<'\n';
    }
    //issue two
    {
        std::basic_string<char,std::char_traits<char>,B<char>>s;
        std::cout<<s.max_size()<<'\n';
        std::cout<<std::string().max_size()<<'\n';
    }
}

Issue one: First, an empty string is created using the allocator A which has a
max_size of one. Then, two characters 'a' and 'b' are inserted into this
string. Obviously, after these two insertions the size of the string is two so
that is printed. This string is stored using small string optimization,
internally in the string there is an array of sixteen characters so the
capacity is fifteen (one less because of the zero terminator) which is printed.
The third number printed is the max_size of the string, which is zero.
Therefore, the size of the string exceeds the max_size of the string and true
is printed. This should be impossible. In fact, in libstdc++ this code has
undefined behavior during the call to size, even though size should have no
preconditions:

> _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> size_type
> size() const _GLIBCXX_NOEXCEPT
> {
>   size_type __sz = _M_string_length;
>   if (__sz > max_size ())
>     __builtin_unreachable ();
>   return __sz;
> }

Issue two: An empty string is created using the allocator B which has a
max_size of zero. Then, the max_size of this string is printed which is the
maximum value representable in size_type. Lastly, the max_size of an empty
string created using the default allocator (std::allocator<char>) is printed.
The max_size of the string created using the allocator B is greater than the
max_size of the string created using the default allocator, which is surely
unintended. The problem is that the max_size reported with the allocator B is
not representable in difference_type, which is exactly what the max_size
reported with the default allocator avoids.

> _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> size_type
> max_size() const _GLIBCXX_NOEXCEPT
> {
>   const size_t __diffmax
>     = __gnu_cxx::__numeric_traits<ptrdiff_t>::__max / sizeof(_CharT);
>   const size_t __allocmax = _Alloc_traits::max_size(_M_get_allocator());
>   return (std::min)(__diffmax, __allocmax) - 1;
> }

This is how max_size is defined, and the two issues are present here. Issue one
is caused by this definition not considering small string optimization. Issue
two is caused by the subtraction of one not considering that max_size of the
allocator can return zero, which causes wraparound to the maximum value.
Additionally, __allocmax has the type size_t rather than size_type, so the
result could be incorrect if the allocator's max_size returned a value greater
than SIZE_MAX: the conversion would wrap around, whereas saturating is probably
what is intended. This last issue cannot be demonstrated on x86_64, so I didn't
include it in the example.
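
To make the arithmetic concrete, the formula can be modelled outside of
basic_string. The sketch below is not libstdc++ code: current_max_size mirrors
the quoted expression for _CharT = char, while sketched_max_size and its
local_capacity parameter are hypothetical names showing one possible way to
avoid both issues (guarding the subtraction, and never reporting less than the
small string capacity). Whether that is the appropriate fix is an open
question.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Model of the quoted formula for _CharT = char:
// min(diffmax, allocmax) - 1, with no guard on the subtraction.
constexpr std::size_t current_max_size(std::size_t allocmax) {
    const std::size_t diffmax = PTRDIFF_MAX / sizeof(char);
    return std::min(diffmax, allocmax) - 1;  // wraps around when allocmax == 0
}

// Hypothetical alternative (an assumption, not the libstdc++ implementation).
// local_capacity stands for the small string capacity (15 for char on x86_64).
constexpr std::size_t sketched_max_size(std::size_t allocmax,
                                        std::size_t local_capacity) {
    const std::size_t diffmax = PTRDIFF_MAX / sizeof(char);
    const std::size_t limit = std::min(diffmax, allocmax);
    // Reserve one element for the terminator only if there is one to reserve.
    const std::size_t heap_max = (limit == 0) ? 0 : limit - 1;
    // The local buffer is usable without ever calling the allocator.
    return std::max(heap_max, local_capacity);
}
```

With these definitions, current_max_size(1) is 0 and current_max_size(0) is
SIZE_MAX, matching the output of the example, whereas sketched_max_size(1, 15)
and sketched_max_size(0, 15) are both 15.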

Theoretically, unintended wraparound could also happen in __diffmax if
PTRDIFF_MAX/sizeof(_CharT) is greater than SIZE_MAX and sizeof(_CharT) is not a
positive integer power of two. However, I'm not sure any targets exist where
this combination can arise so it's probably not worth considering (assuming
_CharT is a character type, other types for _CharT should be very rare).
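
Both wraparound concerns above come down to a narrowing conversion into size_t.
Since neither can be triggered on x86_64, the sketch below illustrates the
difference between wrapping and saturating with deliberately small types.
saturate_to is a hypothetical helper introduced only for illustration; it is
not part of libstdc++.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: converts an unsigned value to another unsigned type,
// clamping to the destination's maximum instead of wrapping around.
template <typename To, typename From>
constexpr To saturate_to(From value) {
    static_assert(std::is_unsigned<To>::value && std::is_unsigned<From>::value,
                  "this sketch only handles unsigned types");
    // Compare in the widest unsigned type so the comparison itself cannot wrap.
    return static_cast<std::uintmax_t>(value) >
                   static_cast<std::uintmax_t>(std::numeric_limits<To>::max())
               ? std::numeric_limits<To>::max()
               : static_cast<To>(value);
}
```

For instance, static_cast<std::uint16_t>(std::uint64_t{70000}) wraps to 4464,
while saturate_to<std::uint16_t>(std::uint64_t{70000}) yields 65535.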
