Hello, on s390x the 128-bit integer type is only aligned to 8 bytes by default, but when lock-free atomic operations can only be performed on objects aligned to 16 bytes. However, we've noticed that GCC sometimes falls back to library calls *even if* the object is actually 16 byte aligned, and GCC could easily know this from type alignment information.
However, it turns out that get_builtin_sync_mem *ignores* type alignment data, and only looks at *mode* alignment and pointer alignment data. This is a problem since mode alignment doesn't know about user-provided alignment attributes, and while pointer alignment does sometimes know about those, this usually only happens with optimization on and also if the optimizers can trace pointer assignments to the final object. One important use case where the latter does not happen is with the C++ atomic classes, because here the object is accessed via the "this" pointer. Pointer alignment tracking at this point never knows the final object, and thus we always get a library call *even though* libstdc++ has actually marked the class type with the correct alignment atttribute. Now one question might be, why does get_pointer_alignment not take type alignment into account by itself? This appears to be deliberate to avoid considering numeric pointer values to be aligned when they are not for target-specific reasons (e.g. the low bit that indicates Thumb on ARM). However, this is not an issue in get_builtin_sync_mem, where we are actually interested in the alignment of the MEM we're just about to generate, so it should be fine to check type alignment here. This patch does just that, fixing the issue we've been seeing. Tested on s390x-ibm-linux. OK for mainline? Bye, Ulrich ChangeLog: * builtins.c (get_builtin_sync_mem): Respect type alignment. testsuite/ChangeLog: * gcc.target/s390/md/atomic_exchange-1.c: Do not use -latomic. (aligned_int128): New data type. (ae_128_0): Use it instead of __int128 to ensure proper alignment for atomic accesses. (ae_128_1): Likewise. (g128): Likewise. (main): Likewise. Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 274142) +++ gcc/builtins.c (working copy) @@ -6001,9 +6001,16 @@ mem = validize_mem (mem); - /* The alignment needs to be at least according to that of the mode. */ - set_mem_align (mem, MAX (GET_MODE_ALIGNMENT (mode), - get_pointer_alignment (loc))); + /* The alignment needs to be at least according to that of the mode. + Also respect alignment requirements of the type, and alignment + info that may be deduced from the expression itself. */ + unsigned int align = GET_MODE_ALIGNMENT (mode); + if (POINTER_TYPE_P (TREE_TYPE (loc))) + { + unsigned int talign = min_align_of_type (TREE_TYPE (TREE_TYPE (loc))); + align = MAX (align, talign * BITS_PER_UNIT); + } + set_mem_align (mem, MAX (align, get_pointer_alignment (loc))); set_mem_alias_set (mem, ALIAS_SET_MEMORY_BARRIER); MEM_VOLATILE_P (mem) = 1; Index: gcc/testsuite/gcc.target/s390/md/atomic_exchange-1.c =================================================================== --- gcc/testsuite/gcc.target/s390/md/atomic_exchange-1.c (revision 274142) +++ gcc/testsuite/gcc.target/s390/md/atomic_exchange-1.c (working copy) @@ -1,7 +1,7 @@ /* Machine description pattern tests. */ /* { dg-do compile } */ -/* { dg-options "-lpthread -latomic" } */ +/* { dg-options "-lpthread" } */ /* { dg-do run { target { s390_useable_hw } } } */ /**/ @@ -119,19 +119,21 @@ /**/ #ifdef __s390x__ +typedef __int128 __attribute__((aligned(16))) aligned_int128; + __int128 -ae_128_0 (__int128 *lock) +ae_128_0 (aligned_int128 *lock) { return __atomic_exchange_n (lock, 0, 2); } __int128 -ae_128_1 (__int128 *lock) +ae_128_1 (aligned_int128 *lock) { return __atomic_exchange_n (lock, 1, 2); } -__int128 g128; +aligned_int128 g128; __int128 ae_128_g_0 (void) @@ -274,7 +276,7 @@ #ifdef __s390x__ { - __int128 lock; + aligned_int128 lock; __int128 rval; lock = oval; -- Dr. Ulrich Weigand GNU/Linux compilers and toolchain ulrich.weig...@de.ibm.com