http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257
Summary: std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
Product: gcc
Version: 4.1.2
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: mohsinrza...@gmail.com

The output of the following program is:

4
4

#include <string>
#include <iostream>

using namespace std;

int main()
{
    try
    {
        string emptyStr1;
        string emptyStr2;
        string emptyStr3;

        emptyStr3.assign(emptyStr2.data(), 4);

        cout << emptyStr1.size() << endl;
        cout << emptyStr3.size() << endl;
    }
    catch(...)
    {
        cout << "Reached here" << endl;
    }
}

The sizes of emptyStr1 and emptyStr3 should not have been modified when
emptyStr3 was assigned up to 4 bytes of emptyStr2's contents, since all three
strings are empty.

From what I've understood of the basic_string code, the following code flow
leads to this result:

<snip>
  template<typename _CharT, typename _Traits, typename _Alloc>
    inline basic_string<_CharT, _Traits, _Alloc>::
    basic_string()
#ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
    : _M_dataplus(_S_empty_rep()._M_refdata(), _Alloc()) { }
#else
    : _M_dataplus(_S_construct(size_type(), _CharT(), _Alloc()), _Alloc()) { }
#endif
</snip>

First, the default constructor. Since "_GLIBCXX_FULLY_DYNAMIC_STRING" is not
defined in my implementation, _M_dataplus._M_p is initialized to the address
_S_empty_rep_storage + sizeof(_Rep), i.e. the character data just past the
shared empty _Rep header, for all three strings.

<snip>
      const _CharT*
      data() const
      { return _M_data(); }
</snip>

<snip>
      _CharT*
      _M_data() const
      { return _M_dataplus._M_p; }
</snip>

<snip>
  template<typename _CharT, typename _Traits, typename _Alloc>
    basic_string<_CharT, _Traits, _Alloc>&
    basic_string<_CharT, _Traits, _Alloc>::
    assign(const _CharT* __s, size_type __n)
    {
      __glibcxx_requires_string_len(__s, __n);
      _M_check_length(this->size(), __n, "basic_string::assign");
      if (_M_disjunct(__s) || _M_rep()->_M_is_shared())
        return _M_replace_safe(size_type(0), this->size(), __s, __n);
      else
        {
          // Work in-place.
          const size_type __pos = __s - _M_data();
          if (__pos >= __n)
            _M_copy(_M_data(), __s, __n);
          else if (__pos)
            _M_move(_M_data(), __s, __n);
          _M_rep()->_M_set_length_and_sharable(__n);
          return *this;
        }
    }
</snip>

When assign is called with emptyStr2.data(), which returns the same pointer
that the default constructor stored in _M_dataplus._M_p, _M_disjunct() returns
false, as does _M_is_shared(), so we end up in the else block. There, since
__s == _M_dataplus._M_p, which is what _M_data() returns, __pos is 0. Neither
__pos >= __n (0 >= 4) nor the else-if on __pos holds, so the only statements
executed are the last two in the else block, which set the length to 4 in the
static storage returned by _M_rep(). This corrupts the static storage shared
by all empty std::string objects.

Does that make sense?

==========================
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
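
The code flow described in the report can be replayed with a small standalone model. This is only a sketch under stated assumptions: ToyRep, ToyString, AssignPath, ReplayResult and replay_report are hypothetical names invented for illustration, the layout is simplified, and none of it is the libstdc++ source. It models only the branch decision in assign() and the length write, with no character copying and no real out-of-bounds read.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the _Rep header plus its character storage.
// Illustrative only; this is not the real libstdc++ layout.
struct ToyRep {
    std::size_t length;
    int refcount;   // > 0 means "shared" in this model
    char chars[8];  // character area that data() points to
};

// One static block used by every default-constructed ToyString,
// playing the role of _S_empty_rep_storage.
static ToyRep toy_empty_rep = {0, 0, {0}};

enum AssignPath { SafePath, InPlaceWithCopy, InPlaceLengthOnly };

class ToyString {
public:
    ToyString() : rep_(&toy_empty_rep) {}

    std::size_t size() const { return rep_->length; }
    const char* data() const { return rep_->chars; }

    // Same test as described for _M_disjunct: true when s lies outside
    // the range [data(), data() + size()].
    bool disjunct(const char* s) const {
        std::less<const char*> before;
        return before(s, data()) || before(data() + size(), s);
    }

    bool is_shared() const { return rep_->refcount > 0; }

    // Models the branch decision of the quoted assign(). On SafePath the
    // model does nothing; the real code would call _M_replace_safe there.
    AssignPath assign(const char* s, std::size_t n) {
        if (disjunct(s) || is_shared())
            return SafePath;
        const std::size_t pos = static_cast<std::size_t>(s - data());
        // The quoted code copies when pos >= n and moves when pos != 0.
        // The model only records whether either step would have run.
        const bool copies = (pos >= n) || (pos != 0);
        rep_->length = n;  // the write lands in the shared static block
        return copies ? InPlaceWithCopy : InPlaceLengthOnly;
    }

private:
    ToyRep* rep_;
};

struct ReplayResult {
    std::size_t size1;
    std::size_t size3;
    AssignPath path;
};

// Replays the scenario from the report using the toy types above.
inline ReplayResult replay_report() {
    toy_empty_rep.length = 0;  // reset so repeated calls behave the same
    ToyString str1, str2, str3;
    ReplayResult r;
    r.path = str3.assign(str2.data(), 4);
    r.size1 = str1.size();
    r.size3 = str3.size();
    return r;
}
```

In this model, replay_report() reports a size of 4 for both str1 and str3 and the InPlaceLengthOnly path, which matches the "4 4" output above: the length is written once, into storage that every empty string shares.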