[Bug c++/20099] New: -pthreads should imply -fno-threadsafe-statics
The '-fno-threadsafe-statics' is purely a pessimization for POSIX pthreads code. Since the POSIX standard already requires a mutex around operations that modify data, and the C++ standard specifies lazy initialization, POSIX code will already have mutexes to make static initialization thread safe. The code is, therefore, already thread safe and cannot be made more so. This option does allow non-compliant code to work by relaxing some of the requirements in the POSIX standard. That's a good option to have, but it should not be on by default. Code should, by default, get the benefit of the optimizations the standard allows. In summary, an option that is a pure pessimization for compliant code, and is only required to make non-compliant code work, should not be on by default when that standard is invoked. This is the case for other options in this category (writable strings, relaxed aliasing, and so on). So, '-pthreads' should imply '-fno-threadsafe-statics', and an option like '-fthreadsafe-statics' should be created to request this option. -- Summary: -pthreads should imply -fno-threadsafe-statics Product: gcc Version: 3.4.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: davids at webmaster dot com CC: gcc-bugs at gcc dot gnu dot org GCC build triplet: i386-redhat-linux GCC host triplet: i386-redhat-linux GCC target triplet: i386-redhat-linux http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-20 01:09 --- Do we agree that this is a pure pessimization for POSIX-compliant code? Do we agree that POSIX already requires a mutex to protect code that might modify an object? -- What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-20 02:46 --- In your example: class A { A(); int t; }; void f() { static A a; } I don't get it. What's the problem with this? Obviously, if you plan to call 'f()' from multiple threads, you must do it while holding a mutex because it might modify 'a'. This is just like any other function that modifies (or might modify) data. -- What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-20 02:50 --- You say, "how can someone know that A will modify memory"? The answer is, the C++ standard says so, section 7.1.2. They simply read that section of the standard, and they know that the function might modify memory. If they know for sure that it won't in a particular case, they are safe. If they don't know, and it might modify memory, then POSIX requires them to put in a mutex. You cannot create code that works with this option and doesn't work without it except by violating the POSIX standard. So POSIX code should not have this option enabled by default -- it's a pure pessimization. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-20 10:48 --- There is certainly the eternal argument whether a class should implement its own locks internally or whether the caller should implement them. The first case simplifies calling at the expense of overhead when you don't need it. The second makes callers have more knowledge of how the functions they're calling work, but make possible optimizations in cases where the locks aren't needed, and only the caller knows this. POSIX made this decision. The POSIX memory visibility rules specifically require a mutex to be held in cases where an object may be modified by another concurrent thread. So POSIX code *must* already contain such mutexes. This requirement in POSIX is as clear as anything can be. The rationale is that on some platforms, locks may be very expensive, so requiring them implicitly was carefully avoided. POSIX requires shared data to be explicitly locked by application code when data might be modified in one thread and read in another concurrently. While the C++ standard doesn't say anything about threads, it says a lot about how static objects are initialized. It specifies initialization on first use, thus first use is a modification. Use that might be first use might be a modification. There is no other sensible reading of the POSIX standard. There is no other sensible reading of the C++ standard. The arguments presented are disingenous. They would equally well defend the decision to initialize all static objects before calling 'main' in multithreaded code. After all, C++ doesn't say anything about multithreaded code and POSIX doesn't say anything about initializating C++ static objects. In fact, the arguments would defend crashing on any multithreaded C++ code, which is obviously not what anyone wants. If there is going to be C++/POSIX code, the C++ standard and the POSIX standard will have to be made to coexist. POSIX requires the application to figure out when objects may be accessed concurrently, and does so for good reason. In a non-POSIX context, it may make sense to have the compiler try to automatically insert mutexes, but it definitely doesn't in the POSIX case. The compiler should, by default, make the optimizations that the standards permit. This is the way every other similar option has been implemented. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-20 19:43 --- >The reasoning is that dynamically created objects are often used locally in one >thread, in which case locking would be unnecessary, while a singleton is always >accessible to all threads. Accessible, but not necessarily accessed. In fact, there is no way you can know whether this synchronization overhead is necessary, and for POSIX it's only needed if the code violates the memory visibility rules. >Static locals in C++ are an equivalent to pthread_once in C/POSIX. Even in the single-threaded case, C++ leaves it undefined what happens if you reenter a function that invokes a static initializer from that static initializer. To argue that this means it should be defined for the multi-threaded case is absurd. C++ requires the initialization to be complete before you are allowed to pass the intializer again. POSIX requires locks when data that might be shared might be modified. Bluntly, this is totally opposite to the entire philosophy of the POSIX standard and the wording of the C++ standard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-20 23:59 --- What you keep ignoring is that the POSIX standard explicitly declares it a bug to access data in one thread while it may be modified in another. It's not "unfortunate timing", it's failure to use the proper code to ensure correct timing. You say a hypothetical multithreaded C++ should state this as the semantics for static locals, and I don't disagree with you, provided we are not talking about one based on POSIX threads. POSIX specifically made the design decision to always require explicit locks and to always require the programmer to find, document, and lock cases where concurrent accesses might occur. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-23 01:30 --- The '-pthreads' flag should imply '-fno-threadsafe-statics'. For every other similar flag I can find, the default is to permit the compiler to make the optimizations that standard allows and specific flags are needed to disable the optimization. -- What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-23 02:19 --- > It's not the question of optimization but correctness. Exactly, and not locking objects that may be modified from another thread is not correct. Only the programmer knows whether an object may be modified from another thread. > Having an "initialize on first use" semantics which breaks when > multiple threads use it is as wrong as using global variables > for passing data between functions and forcing clients to > introduce locks to make them reentrant. It doesn't break when multiple threads use it, it breaks when multiple threads use it in a way where one thread might modify an object while another thread access it without locking. POSIX specifies that everything breaks when misused this way. You seem to have an attitude that it's difficult to track data dependencies and ensure correct locking. You talk about "unfortunate timing". What you're missing is that this is what people who write multithreaded programs do all day long. Those who are good at writing multithreaded programs are good at doing exactly this. And this change does not remove any of the burden -- it simply removes a choice (one can trivially add these locks when one knows they're needed or doesn't know they're not needed). > It is usable when > clients are careful, but it's a bad default when we can't > expect all calls to be done from a single thread. Any function that might be called from multiple threads concurrently will require special coding to deal with this situation. You would have to go out of your way to create a situation where just protecting the initialization is sufficient. You have to be careful when you write multithreaded code, period. Functions may require you to hold certain locks when you call them, prohibit you from holding other locks, or require the callers to impose synchronization. This is not some obscure detail unique to COFU, it's the bread and butter of multithreaded programming. POSIX puts the responsibility fully on the application programmer to place locks where there might be concurrent access. In exchange for this effort, the programmer gets the performance benefit of there not being locks when they are not needed. Some hypothetical multithreaded C++ standard might choose a different route. But it would not be POSIX. Again, it is this simple: POSIX prohibits an object from being modified in one thread and accessed in another. C++ specifies initialization on first use. Initialization is modification. Thus first use requires a lock if there can be a concurrent access. The language cannot tell when there can and cannot be a concurrent access, the language cannot tell when a call cannot possibly be first use. I could perhaps ignore all of this if the behavior discussed made a reasonably significant category of code "just work" or removed some of the work involved in getting synchronized code correct, but it does not. In the vast majority of cases, the static object may be modified and locks will be needed anyway. In many cases, it will be known that the static object has already been intialized and will not be modified, and the locks added will be wasted. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-24 20:42 --- >The only thing which would change when you remove the automatically >inserted locking is that some programs which used to work are now >broken, and that some other programs which used to deadlock now >invoke undefined behavior from entering an initializer recursively. First, if we're talking about pthreads programs, which is the only case I'm suggesting removing the locking for, then those programs are already broken. The pthreads standard requires locks when data may be changed in one thread and accessed in another. Entering an initializer recursively has always been undefined behavior. Any code that might do that is broken. In any event, the entire thrust of this argument is bogus. If GCC/G++ are going to have non-portable features that make code work when they're enabled and break when they're disabled, they definitely should not be on by default. That they hide bugs in code that claims standards compliance on one platform, but allow them to fail on another, is an argument *against* the option, not for it. (Or are you seriously arguing that the C++ standard and the POSIX standard *require* this behavior?) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-24 22:22 --- > They are non-portable no matter how static initializers are done: > C++ doesn't include threads and POSIX doesn't include C++. That's a bogus argument. There are no conflicts between the two standards. > Taking portability aside (as they are already non-portable), this is > a wonderful quote when taken out of context. Yeah, if an option makes > more code working and its negation makes more code break, let's make > the breaking variant the default :-) In any context, I stand by this quote. Code that does not conform to the standards it requests during compilation should break. Compatability options that make code work are fine, they just should not be on by default, especially when they have a performance cost. Broken code should break. > For me static locals in C++ are the equivalent of pthread_once in C/POSIX. > A hypothetical C++/POSIX should make them MT-safe. They are MT-safe, provided you lock shared data that may be accessed concurrently. POSIX never permits you to modify shared data in one thread when it may be accessed by another without a lock. By your definition of MT-safe, almost no function is MT-safe. Is 'strchr' MT-safe if you call it on a string while another function might modify that string? No, so lets put automatic locks around 'strchr' and any function that might modify a string. A function is MT-safe if it does the same thing in an MT context that it does in a UT context, provided the caller uses locks to prevent access to data in one thread while another thread might access it. Static initialization is already MT safe and these locks do not make it more so. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug c++/20099] -pthreads should imply -fno-threadsafe-statics
--- Additional Comments From davids at webmaster dot com 2005-02-24 23:45 --- > The difference is that almost all uses of strchr are done on data > which is not shared between threads, while many static local > initializers are used on objects which are accessed from multiple > threads (when multiple threads are used at all). In both cases, some are, some aren't. Some uses of 'strchr' are on data that's used by multiple threads. And some object intialized statically are in functions that are only called by a single thread. The cases are not as different as you seem to think they are. > By your reasoning malloc should be unsafe to use from multiple threads > if the user did not explicitly put a lock around it, because it modifies > shared data (the heap). The difference is that 'malloc' is a function supplied by the system and the user, in principle, has no knowledge of its internals. In principle, the user has no idea that there is a "heap" or that it is a shared structure. To support your argument, you would have to argue that if the user tried to implement his own memory allocation function, the language should somehow detect that it manipulates a shared structure (like the heap) and apply locks around that structure. > And errno should not be automatically made > thread-local, because it hurts performance in case thread-locality is not > needed. Again, to make your argument work, you would have to argue that the system should detect that new user-written code accesses some static variable akin to 'errno' and automatically change that variable to be thread local. > After all, "POSIX puts the responsibility fully on the application > programmer to place locks where there might be concurrent access. In > exchange for this effort, the programmer gets the performance benefit > of there not being locks when they are not needed". > Right? Right. User code should not get automatic locks put around it. Locks are needed where the programmer couldn't possibly know that shared data is manipulated, and the locks should be coded in the function, not magically added by the language. None of your cases even remotely support the argument that the implementation should detect possible cases where data is modified in one thread while it's used in another thread and automatically insert locks to make this access safe. (In the context of POSIX threads.) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20099
[Bug tree-optimization/35653] [4.3/4.4 Regression]: gcc-4.3 -O3/-ftree-vectorize regression: incorrect code generation
--- Comment #17 from davids at webmaster dot com 2008-03-26 22:56 --- I wonder why -Wcast-align doesn't generate a warning in this case. It would seem that would be what it's for. I get a segfault in the reduced test case and no warning from -Wcast-align. The documentation seems to specifically suggest that this is the case where a warning is generated. -- davids at webmaster dot com changed: What|Removed |Added CC| |davids at webmaster dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35653
[Bug c++/35730] New: ICE on valid code convert_move expr.c:373
The following code produces an ICE with GCC-4.3.0 when compiled with g++ and -O. Removing -O or using gcc removes the ICE. #include int moo(void) { unsigned char msg[] = { 0, 0 }; unsigned char data[2]; memcpy(data, msg, sizeof (msg)); return memcmp(data, msg, sizeof (data)) != 0; } Also, changing msg's declaration from "msg[]" to "msg[2]" avoids the ICE. g++430 test.c -c -O test.c: In function 'int moo()': test.c:6: internal compiler error: in convert_move, at expr.c:373 Please submit a full bug report, with preprocessed source if appropriate. See <http://gcc.gnu.org/bugs.html> for instructions Using gcc430, g++423, or g++412, the code compiles fine. Of the builds I tested, only g++ 4.3.0 has the problem. -- Summary: ICE on valid code convert_move expr.c:373 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: davids at webmaster dot com GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35730
[Bug c++/35730] ICE on valid code convert_move expr.c:373
--- Comment #1 from davids at webmaster dot com 2008-03-28 09:24 --- #include int moo(void) { unsigned char msg1[] = { 0, 0 }; unsigned char msg2[] = { 0, 0 }; memcpy(msg1, msg2, 2); return memcmp(msg1, msg2, 2) != 0; } With this code, changing the sizes in both memcpy and memcmp from '2' to '1' makes the problem go away. So does changing both arrays from '{ 0, 0 }' to { 0, 0, 0 }'. So maybe it's an off-by-one? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35730
[Bug middle-end/31862] Loop IM and other optimizations harmful for -fopenmp
--- Comment #18 from davids at webmaster dot com 2009-02-25 16:06 --- This is a real bug. There is simply no way to write correct threaded code if the compiler is free to read a variable and write it back when the code didn't specifically tell it to. Optimizations on threaded code cannot add a write to a variable unless they can prove no other thread could know the location of the variable. (It's local, it's address has never been passed to a function, and so on.) In any event, the last time I looked into this, it seemed that such 'optimizations' were typically pessimizations on multi-threaded code anyway, as they added expensive cache misses. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31862
[Bug middle-end/31862] Loop IM and other optimizations harmful for -fopenmp
--- Comment #20 from davids at webmaster dot com 2009-02-25 17:53 --- I don't like this either: tmp = var; modified = false; for (i = 0; i < 100; i++) { if (i > x) tmp = i, modified = true; } if (modified) var = tmp; This can be a pessimization as well. If 'modified' is almost always false, the 'tmp = var;' can force a cache unshare for no reason. If this code is not performance critical, the optimization is pointless. If it is, the pessimization can be significant. IMO, these kinds of optimizations are not worth doing. Almost any optimization that can introduce an additional memory access risks a net performance loss in some use cases. The compiler cannot determine the use case by static code inspection. Now, in this case, the 'tmp = var;' is obviously superfluous, you can just eliminate it, so this isn't a good example of what I'm talking about. But in general, an optimization should not *add* a memory operation the code did not request unless you can show that it will always remove at least one operation of comparable cost. Otherwise it can be a net performance loss all the time in some use cases. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31862
[Bug middle-end/31862] Loop IM and other optimizations harmful for -fopenmp
--- Comment #23 from davids at webmaster dot com 2009-02-25 18:35 --- "Really to me this is still a valid transformation even in the inside threads. If you want to be able to access via different threads, use locks, heavy or light weight ones." Absolutely, you do use locks everywhere you write to a variable that might be read from by another thread. The problem is that the compiler introduces locks where you *don't* write to a variable. It is simply impossible to use locks if the compiler can add a write where you didn't expect one. The classic example: if(can_write) acquire_lock(); for(int i=0; i<100; i++) { some_stuff(i); if(can_write) shared_variable+=i; } if(can_write) release_lock(); Here this code does acquire a lock if it plans to modify the shared variable. But the optimizations discussed will break this code. Also, you can have the same problem with this kind of code without threads. Imagine, for example, if the 'shared_variable' may be in read-only memory and 'can_write' indicates this case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31862
[Bug tree-optimization/32500] Loop optimization limits range to size of array used inside loop
--- Comment #9 from davids at webmaster dot com 2007-06-26 03:42 --- This should be marked as a 4.2.0 regression. -- davids at webmaster dot com changed: What|Removed |Added CC||davids at webmaster dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32500
[Bug c++/13684] local static object variable constructed once but ctors and dtors called multiple times on same memory when called in multiple threads
--- Additional Comments From davids at webmaster dot com 2005-02-02 23:22 --- This is not a GCC bug and should not be fixed in GCC. The bug is in the test code which accesses an object that is shared by multiple threads without proper mutexes. Period. End of story. The correct fix is to acquire a mutex before calling a function that might accessed shared data (and constructing a static object counts). This should be common sense, and it's surprising to me that it is not. Anyone can create code that fails if you call functions that manipulate shared data without holding any mutexes. DS -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13684