[Bug libstdc++/48257] New: std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 Summary: std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data() Product: gcc Version: 4.1.2 Status: UNCONFIRMED Severity: minor Priority: P3 Component: libstdc++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: mohsinrza...@gmail.com The output of the following program is: 4 4 #include #include using namespace std; int main() { try { string emptyStr1; string emptyStr2; string emptyStr3; emptyStr3.assign(emptyStr2.data(), 4); cout << emptyStr1.size() << endl; cout << emptyStr3.size() << endl; } catch(...) { cout << "Reached here" << endl; } } The size of emptyStr1/3 should not have been modified when emptyStr3 was assigned the contents of emptyStr2 up to 4 bytes although all string are empty. >From what I've understood of the basic_string code, the following code flow will result in this case: template inline basic_string<_CharT, _Traits, _Alloc>:: basic_string() #ifndef _GLIBCXX_FULLY_DYNAMIC_STRING : _M_dataplus(_S_empty_rep()._M_refdata(), _Alloc()) { } #else : _M_dataplus(_S_construct(size_type(), _CharT(), _Alloc()), _Alloc()) { } #endif First, the default constructor. Since I do not have "_GLIBCXX_FULLY_DYNAMIC_STRING" defined in my implementation, _M_dataplus._M_p will get initialized to &(_S_empty_rep_storage + sizeof (_Rep)) for all three strings. const _CharT* data() const { return _M_data(); } _CharT* _M_data() const { return _M_dataplus._M_p; } template basic_string<_CharT, _Traits, _Alloc>& basic_string<_CharT, _Traits, _Alloc>:: assign(const _CharT* __s, size_type __n) { __glibcxx_requires_string_len(__s, __n); _M_check_length(this->size(), __n, "basic_string::assign"); if (_M_disjunct(__s) || _M_rep()->_M_is_shared()) return _M_replace_safe(size_type(0), this->size(), __s, __n); else { // Work in-place. const size_type __pos = __s - _M_data(); if (__pos >= __n) _M_copy(_M_data(), __s, __n); else if (__pos) _M_move(_M_data(), __s, __n); _M_rep()->_M_set_length_and_sharable(__n); return *this; } } When assign is called using emptyStr2.data(), which returns the same value as that set above for _M_dataplus._M_p, _M_disjunct() will return 0, as will _M_is_shared() and we end up going into the else block. Here, since __s = _M_dataplus._M_p, which is what _M_data() returns, __pos = 0 and so the only statements we end up executing are the last two in the else block thereby setting the length = 4 in the static storage returned by _M_rep(). This results in the corruption of static storage used by all std::string objects. Does that make sense? == Using built-in specs. Target: x86_64-redhat-linux Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux Thread model: posix gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 Mohsin changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | --- Comment #2 from Mohsin 2011-03-23 19:09:31 UTC --- (In reply to comment #1) > this is undefined behaviour, the standard requires that emptyStr2.data() > points > to an array of at least 4 chars Kindly point me to where this is stated in the standards. I have searched but have not been able to find any information in this regard.
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 Mohsin changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | --- Comment #4 from Mohsin 2011-03-24 06:19:52 UTC --- (In reply to comment #3) > In C++03 see 21.3.5.3 [lib.string.assign] paragraph 6 where > string::assign(const char* s, size_t n) is defined to return > assign( string(s, n) ) and that constructor: > > Constructs an object of class basic_string and determines its initial string > value from the array of charT of length n whose first element is designated by > s, > > In C++0x the requirement is explicit, [string::assign] paragraph 8 says > > Requires: s points to an array of at least n elements of charT. I believe this is what you were referring to (from N3242=11-0012): basic_string& assign(const charT* s, size_type n); 8 Requires: s points to an array of at least n elements of charT. 9 Throws: length_error if n > max_size(). 10 Effects: Replaces the string controlled by *this with a string of length n whose elements are a copy of those pointed to by s. 11 Returns: *this. Thanks for pointing this out. However, does libstdc++ throw and error if "n > max_size()"? I don't think it does. >From the latest libstdc++ documentation: template basic_string< _CharT, _Traits, _Alloc > & std::basic_string< _CharT, _Traits, _Alloc >::assign ( const _CharT * __s, size_type __n ) Set value to a C substring. Parameters: s The C string to use. n Number of characters to use. Returns: Reference to this string. This function sets the value of this string to the first n characters of s. If n is is larger than the number of available characters in s, the remainder of s is used. Definition at line 261 of file basic_string.tcc. References std::basic_string< _CharT, _Traits, _Alloc >::size(). I don't see this as saying that the string has to have at least n elements OR that an error will be thrown if it isn't. >From the latest basic_string code: template 00259 basic_string<_CharT, _Traits, _Alloc>& 00260 basic_string<_CharT, _Traits, _Alloc>:: 00261 assign(const _CharT* __s, size_type __n) 00262 { 00263 __glibcxx_requires_string_len(__s, __n); 00264 _M_check_length(this->size(), __n, "basic_string::assign"); 00265 if (_M_disjunct(__s) || _M_rep()->_M_is_shared()) 00266 return _M_replace_safe(size_type(0), this->size(), __s, __n); 00267 else 00268 { 00269 // Work in-place. 00270 const size_type __pos = __s - _M_data(); 00271 if (__pos >= __n) 00272 _M_copy(_M_data(), __s, __n); 00273 else if (__pos) 00274 _M_move(_M_data(), __s, __n); 00275 _M_rep()->_M_set_length_and_sharable(__n); 00276 return *this; 00277 } 00278 } I don't see any errors being thrown. I am not trying to nit-pick but I've raised this issue because I was affected by it and had to spend several days isolating it. It would have made my life a whole lot easier if, as you claim, libstdc++ had thrown an error. Just a request, but please let the discussion complete before you go ahead and mark the bug as Resolved-Invalid. I don't mind it being marked invalid, but I consider it disrespectful for you to not even wait for my response and rushing to mark it as invalid. Once we're through discussing, I will mark it invalid myself if necessary.
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 --- Comment #8 from Mohsin 2011-03-24 09:50:11 UTC --- (In reply to comment #6) > Also, GCC 4.1.2 is ancient and not supported here, you should either report > bugs to the vendor of your version (Red Hat) or refer to a supported version. > In my last comment, I did not post snippets to old code - I posted them off the latest code. > As for being disrespectful, the bug was invalid and I know that for a fact, > which is why I didn't wait for you to agree. You don't need to reopen the bug > in order to post a new comment if you'd like pointers to the relevant pieces > of > the standard or the code. Agreed.
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 --- Comment #9 from Mohsin 2011-03-24 09:52:56 UTC --- (In reply to comment #7) > (In reply to comment #4) > > > > I don't see any errors being thrown. I am not trying to nit-pick but I've > > raised this issue because I was affected by it and had to spend several days > > isolating it. It would have made my life a whole lot easier if, as you > > claim, > > libstdc++ had thrown an error. > > I never claimed there'd be an exception throw in your original example, I said > it's undefined behaviour. An exception is thrown if n > max_size() > Agreed. > > Just a request, but please let the discussion complete before you go ahead > > and > > mark the bug as Resolved-Invalid. I don't mind it being marked invalid, but > > I > > consider it disrespectful for you to not even wait for my response and > > rushing > > to mark it as invalid. Once we're through discussing, I will mark it invalid > > myself if necessary. > > That's not how it works, most users do not close their own bugs, maintainers > don't want to keep revisiting old bugs to see if the user has decided to > accept > the resolution yet. Agreed.(In reply to comment #7) > (In reply to comment #4) > > > > I don't see any errors being thrown. I am not trying to nit-pick but I've > > raised this issue because I was affected by it and had to spend several days > > isolating it. It would have made my life a whole lot easier if, as you > > claim, > > libstdc++ had thrown an error. > > I never claimed there'd be an exception throw in your original example, I said > it's undefined behaviour. An exception is thrown if n > max_size() > > > Just a request, but please let the discussion complete before you go ahead > > and > > mark the bug as Resolved-Invalid. I don't mind it being marked invalid, but > > I > > consider it disrespectful for you to not even wait for my response and > > rushing > > to mark it as invalid. Once we're through discussing, I will mark it invalid > > myself if necessary. > > That's not how it works, most users do not close their own bugs, maintainers > don't want to keep revisiting old bugs to see if the user has decided to > accept > the resolution yet. Agreed, thanks.
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 --- Comment #10 from Mohsin 2011-03-24 10:21:56 UTC --- Two questions here: 1. Is the behaviour undefined for __n < number of elements in __s? 2. For cases undefined in the specs, do we take steps to ensure robustness? I still cannot digest that a programmer error could corrupt std::string static memory.
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 --- Comment #11 from Mohsin 2011-03-24 10:24:15 UTC --- (In reply to comment #10) > Two questions here: > > 1. Is the behaviour undefined for __n < number of elements in __s? > Oops! I meant for __n > number of elements in __s. > > 2. For cases undefined in the specs, do we take steps to ensure robustness? I > still cannot digest that a programmer error could corrupt std::string static > memory.
[Bug libstdc++/48257] std::string::assign() corrupts std::string static data when called on emptyString1 using emptyString2.data()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48257 --- Comment #13 from Mohsin 2011-03-25 05:48:21 UTC --- (In reply to comment #12) > (In reply to comment #11) > > (In reply to comment #10) > > > Two questions here: > > > > > > 1. Is the behaviour undefined for __n < number of elements in __s? > > > > > > > Oops! I meant for __n > number of elements in __s. > Yes! The standard defines the behaviour of string::assign when s contains at > least n elements. When that precondition isn't met the definition doesn't > apply > i.e. it's undefined. Sorry for the confusion here Jon. I meant to ask if the specs define what the behaviour should be if __s *does not* contain __n elements. > > > 2. For cases undefined in the specs, do we take steps to ensure > > > robustness? > Where possible, yes, that's what -D_GLIBCXX_DEBUG tries to do. But in general > it's not possible to verify that the supplied string meets the required > length. > Given a const char*, how do you tell if it points to an array of at least n > chars? You can't. We could always look for the null-termination :) But like you say below, this would add overhead and still not handle all cases. > It would be possible to check that the supplied const char* is not the shared > static "empty string" representation, but that would add overhead and still > wouldn't prevent similar errors like: > string s("oops", 999); Understood. > > > I > > > still cannot digest that a programmer error could corrupt std::string > > > static > > > memory. > Really? A sufficiently malicious/careless programmer can corrupt pretty much > anything! Hehe, understood. > In any case, if you try your original example with current releases it doesn't > print '4' Great! I'll try that. Thanks for bearing with my impetuous behaviour. I know it mustn't be easy maintaining such a widely used code base.