[Bug libstdc++/65033] New: C++11 atomics: is_lock_free result does not always match the real lock-free property
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

            Bug ID: 65033
           Summary: C++11 atomics: is_lock_free result does not always match the real lock-free property
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bin.x.fan at oracle dot com

Hi,

The is_lock_free result for an object of type atomic<s3_t>, where s3_t is a C-style struct with size=3 and alignment=1, does not always match the implementation in libatomic.so for atomic operations on this object. I think there is either a bug in the g++ header, so that the g++ 4.9.2 implementation is not C++11 standard conforming, or there is a bug in libatomic.so.

Here is the source code:

-bash-4.1$ cat struct3.cc
#include
#include
using namespace std;
#define N 10
struct s3_t { char a[3]; };
atomic<s3_t> array[N];
s3_t obj;
int main() {
    int i;
    for (i=0;i

[1] libat_lock_n(ptr = 0x216c6, n = 3U), line 64 in "lock.c"
[2] libat_store(n = 3U, mptr = 0x216c6, vptr = 0xffbff580, smodel = 5), line 100 in "gstore.c"
[3] std::atomic::store(this = 0x216c6, __i = STRUCT, _m = memory_order_seq_cst), line 199 in "atomic"
[4] std::atomic_store_explicit(__a = 0x216c6, __i = STRUCT, __m = memory_order_seq_cst), line 828 in "atomic"
[5] std::atomic_store(__a = 0x216c6, __i = STRUCT), line 895 in "atomic"
[6] main(), line 22 in "struct3.cc"

So one of the following two things could be happening here:

1. g++ makes the lock-free property per-object, which is not C++11 standard conforming, and reports it incorrectly with atomic_is_lock_free, or
2. g++ tries to make the lock-free property per-type, but the libatomic.so implementation does not match.

Also, without changing the alignment, I doubt that a size=3, alignment=1 atomic object can always be lock-free on SPARC or x86.
[Bug libstdc++/65033] C++11 atomics: is_lock_free result does not always match the real lock-free property
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

--- Comment #5 from Bin Fan ---
(In reply to Jason Merrill from comment #3)
> (In reply to Bin Fan from comment #0)
> > 2. g++ tries to make lock-free property per-type, but the libatomic.so
> > implementation does not match.
>
> This. We always pass a null pointer to libatomic and do not pass any
> information about the alignment of the type. rth suggested that we might
> try passing a fake, minimally-aligned pointer instead of null as a way of
> communicating the alignment without adding a new entry point.

So after the fix, atomic_is_lock_free will always return 0 for size=3, align=1 atomic struct objects?

I understand that libatomic currently tries to make an atomic object lock-free if its memory location fits in a certain sized window. So for atomic operations such as atomic_store, where the actual address is passed in, the operation can still be either lock-free or locked, right? I'm wondering whether that is standard conforming, since the lock-free property is still per-object, or whether it can be seen as an optimization, i.e. the atomic_is_lock_free query for the object returns 0, but atomic operations on the object could be lock-free.
[Bug c/65083] New: Can not indirectly call some C11 atomic library functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65083

            Bug ID: 65083
           Summary: Can not indirectly call some C11 atomic library functions
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bin.x.fan at oracle dot com

C11 defines these as actual functions, not generic functions or macros:

atomic_thread_fence
atomic_signal_fence
atomic_flag_test_and_set
atomic_flag_test_and_set_explicit
atomic_flag_clear
atomic_flag_clear_explicit

Users should be able to take their address and call them indirectly. However, GCC does not provide definitions of these functions in libatomic.so, so GCC does not allow the user to take the address of these functions.

Here is an example:

-bash-4.1$ gcc -v
Using built-in specs.
COLLECT_GCC=/net/dv104/export/tools/gcc/4.9.2/sparc-S2/bin/gcc.bin
COLLECT_LTO_WRAPPER=/net/dv104/export/tools/gcc/4.9.2/sparc-S2/libexec/gcc/sparc-sun-solaris2.10/4.9.2/lto-wrapper
Target: sparc-sun-solaris2.10
Configured with: ../gcc-4.9.2/configure --prefix=/net/dv104/export/tools/gcc/4.9.2/sparc-S2 --enable-languages=c,c++,fortran --with-gmp=/net/dv104/export/tools/gcc/4.9.2/sparc-S2 --with-mpfr=/net/dv104/export/tools/gcc/4.9.2/sparc-S2 --with-mpc=/net/dv104/export/tools/gcc/4.9.2/sparc-S2
Thread model: posix
gcc version 4.9.2 (GCC)

-bash-4.1$ cat t.c
#include <stdatomic.h>
void (*func_ptr) (memory_order order);
int main()
{
  func_ptr = &atomic_thread_fence;
  (*func_ptr)(memory_order_seq_cst);
  return 0;
}

-bash-4.1$ gcc t.c -latomic
t.c: In function 'main':
t.c:5:15: error: 'atomic_thread_fence' undeclared (first use in this function)
   func_ptr = &atomic_thread_fence;
               ^
t.c:5:15: note: each undeclared identifier is reported only once for each function it appears in
[Bug libstdc++/66842] New: libatomic uses multiple locks for locked atomics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

            Bug ID: 66842
           Summary: libatomic uses multiple locks for locked atomics
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bin.x.fan at oracle dot com
  Target Milestone: ---

Hi GCC folks,

I'm opening this bug to report an issue that may or may not be a real bug. I notice that GCC libatomic uses multiple locks for a locked atomic object whose size is greater than 64 bytes. The granularity seems to be 64 bytes, because for every 64 bytes added to the size, one more lock is added. It seems that this is to protect overlapping locked atomic objects. If locked atomic objects never overlap, then a more efficient way to do locked atomic operations would be for each object to be protected by just one lock that is hashed from its address.

Accessing a member of an atomic struct object is undefined behavior in the C11 standard. So, does GCC support it as an extension, or is using multiple locks unnecessary and therefore a performance bug?

Here is my code to illustrate the issue. I interpose pthread_mutex_lock to count how many times it is called. My GCC version is 4.9.2, and its target is x86_64-unknown-linux-gnu. The libatomic.so I use comes with the GCC 4.9.2 installation.

-bash-4.2$ cat libmythread.c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <assert.h>
#include <pthread.h>

static int counter = 0;

int pthread_mutex_lock (pthread_mutex_t *mutex)
{
  static int (*real_pthread_mutex_lock)(pthread_mutex_t *) = NULL;
  if (real_pthread_mutex_lock == NULL) {
    real_pthread_mutex_lock = dlsym (RTLD_NEXT, "pthread_mutex_lock");
  }
  assert (real_pthread_mutex_lock);
  counter++;
  return real_pthread_mutex_lock (mutex);
}

void display_nlocks ()
{
  printf ("pthread_mutex_lock is called %d times\n", counter);
  return;
}

-bash-4.2$ cat c11_locked_atomics.c
#include <stdatomic.h>
#ifndef SIZE
#define SIZE 1024
#endif

typedef struct { char a[SIZE]; } lock_obj_t;

extern void display_nlocks ();

int main()
{
  lock_obj_t v2 = {0};
  _Atomic lock_obj_t v1 = ATOMIC_VAR_INIT(v2);
  v2 = atomic_load (&v1);
  display_nlocks ();
  return 0;
}

-bash-4.2$ gcc -shared -ldl -fPIC libmythread.c -o libmythread.so
-bash-4.2$ gcc -latomic c11_locked_atomics.c -DSIZE=2048 -L./ -Wl,-rpath=./ -lmythread
-bash-4.2$ LD_PRELOAD=./libmythread.so a.out
pthread_mutex_lock is called 32 times
[Bug c/66842] libatomic uses multiple locks for locked atomics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

--- Comment #2 from Bin Fan ---
I couldn't find a category for libatomic, and my understanding is that C and C++ share the libatomic library.

(In reply to Jonathan Wakely from comment #1)
> This obviously isn't a libstdc++ bug because you're not even using C++!
[Bug c++/66842] libatomic uses multiple locks for locked atomics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

Bin Fan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |c++

--- Comment #4 from Bin Fan ---
Since I haven't seen any response from the C side so far, I changed the example to C++ code and changed the category to c++. Could the C++ folks take a look?

-bash-4.2$ cat c++11_locked_atomics.cpp
#include <atomic>
using namespace std;

#ifndef SIZE
#define SIZE 1024
#endif

typedef struct { char a[SIZE]; } lock_obj_t;

extern "C" { extern void display_nlocks (); }

int main()
{
  lock_obj_t v2 = {0};
  atomic<lock_obj_t> v1 = ATOMIC_VAR_INIT(v2);
  v2 = atomic_load (&v1);
  display_nlocks ();
  return 0;
}

gcc -shared -ldl -fPIC libmythread.c -o libmythread.so -g
g++ -std=c++11 -latomic c++11_locked_atomics.cpp -DSIZE=2048 -g -L./ -Wl,-rpath=./ -lmythread
+ LD_PRELOAD=./libmythread.so
+ a.out
pthread_mutex_lock is called 32 times

The g++ version is still 4.9.2.
[Bug libstdc++/65033] C++11 atomics: is_lock_free result does not always match the real lock-free property
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65033

--- Comment #9 from Bin Fan ---
I verified that this bug is fixed in 5.1.0. However, it is only fixed in g++, so in 5.1.0 gcc and g++ now report different results:

-bash-4.1$ cat is_lock_free.c
#include
#include
#define N 10
typedef struct { char a[3]; } s3_t;
_Atomic s3_t array[N];
s3_t obj;
int main() {
    int i;
    for (i=0;i

#include
using namespace std;
#define N 10
struct s3_t { char a[3]; };
atomic<s3_t> array[N];
s3_t obj;
int main() {
    int i;
    for (i=0;i
[Bug c++/66842] libatomic uses multiple locks for locked atomics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66842

--- Comment #6 from Bin Fan ---
(In reply to Richard Henderson from comment #5)
> When libatomic was first written, it wasn't clear what the restrictions
> from the various languages would be, nor even if that was the best of
> ideas -- things that would Just Work lock-free would fail on other,
> less popular platforms.
>
> Thus libatomic is written such that accesses to the same object, via
> different aliased pages, will work.

Could you clarify what "aliased pages" means? Do you mean that the same object is mapped into two or more different processes at different virtual addresses, and that the locks in libatomic are also shared by those processes? Or something else?

> Thus locks are created on a per-cacheline basis covering one page.

This makes sense if the above understanding of aliased pages is correct. However, what if the memory is not mapped at page boundaries? Then the object may have a different page offset in each mapping and would therefore still be protected by different locks.

This also does not explain why a locked object is protected by multiple locks. If memory is always mapped at page boundaries, then the offset of the object within the page will always be the same, so one lock should work. If memory is not mapped at page boundaries, then if an object is mapped into two "non-overlapped" address ranges inside a page, multiple locks would still not work.

> This does lead to inefficiencies wrt a more straight-forward solution,
> but very careful thought needs to go into changing it.

Besides aliased pages, does libatomic consider supporting nested locked atomic objects? For example, should the following work?

typedef struct {
  _Atomic locked1_t obj1;
  /* other fields */
} locked2_t;

_Atomic locked2_t obj2;

atomic_store(&obj2, ...)
atomic_load(&obj2.obj1, ...)